Investigating Rater Perceptions in the Assessment of Speaking

DOI:

https://doi.org/10.18778/1731-7533.20.3.04

Keywords:

Assessment, speaking, assessment of oral production, reliability, bias

Abstract

In the assessment of spoken production, numerous factors can be identified behind the decisions that raters make when evaluating samples of oral performance. Inter- and intra-rater factors are relatively well documented in reliability and validity studies. Some of those identified in the literature involve the effects of examinee pairing or familiarity with the examinees; others point to gender and gender role perceptions (O’Sullivan 2008); still others appear to be connected with the body language and non-verbal cues that accompany oral production (cf. Krahmer and Swerts 2004, Seiter, Weger, Jensen and Kinzer 2010). While some studies of the assessment of spoken English in exam contexts suggest that raters may not feel as comfortable assessing pronunciation as they do other aspects of a speaker’s performance (Orr 2002, Hubbard, Gilbert and Pidcock 2006, Brown 2006, De Velle 2008), more recent investigations of rater behaviour, drawing on electronic evidence from training, maintenance and online examination programmes, tentatively show that pronunciation is in fact the first category examiners attend to (Hubbard 2011, Chambers and Ingham 2011, Krakowian 2011, Seed 2012, Tynan 2015, Kang and Ginther 2019). This paper examines a large collection of assessments stored in an electronic system to investigate what raters actually seem to pay attention to when ostensibly following rating scales.

References

Bachman, L.F., & Palmer, A. (2002). Language Testing in Practice. Oxford University Press.

Barna, M.L. (1994). Stumbling Blocks in Intercultural Communication. In Samovar, L.A. & Porter, R.E., (eds.), Intercultural Communication. Wadsworth.

Bennett, M.J. (1993). Towards Ethnorelativism: A Developmental Model of Intercultural Sensitivity. In Paige, R.M., (ed.), Education for the Intercultural Experience, pp. 21-71. Intercultural Press.

Bond, T.G., & Fox, C.M. (2007). Applying the Rasch Model. Fundamental Measurement in the Human Sciences. University of Toledo Press.

Brown, A. (2006). An examination of the rating process in the revised IELTS Speaking Test. IELTS Research Reports, 6, 41-65. IELTS Australia, Canberra and British Council, London.

Cavé, C., Guaïtella, I., & Santi, S. (2002). Eyebrow movements and voice variations in dialogue situations. In Hansen, J.H.L. & Pellom, B., (eds.), Proceedings of the 7th International Conference on Spoken Language Processing, 2353–2356.
DOI: https://doi.org/10.21437/ICSLP.2002-225

Chambers, L., & Ingham, K. (2011). The BULATS Online Speaking Test. ESOL Research Notes, 34, 21-25.

De Velle, S. (2008). The revised IELTS Pronunciation scale. ESOL Research Notes, 34, 36-38.

Fulcher, G. (2003). Testing Second Language Speaking. Pearson Longman.

Guaïtella, I., Santi, S., Lagrue, B., & Cavé, Ch. (2009). Are Eyebrow Movements Linked to Voice Variations and Turn-taking in Dialogue? An Experimental Investigation. Language and Speech, 57, 207-222.
DOI: https://doi.org/10.1177/0023830909103167

Hall, E., & Hall, M. (1990). Understanding cultural differences: Germans, French and Americans. Intercultural Press.

Hall, E. (1959). The silent language. Doubleday.

Hall, E. (1966). The hidden dimension. Doubleday Anchor Books.

Hawkey, R. (2004). A Modular Approach to Testing English Language Skills: The development of the Certificates in English Language Skills, CELS, examinations. Cambridge ESOL Research Notes, Volume 16.

Hildreth, P.M., & Kimble, Ch. (2004). Knowledge networks: innovation through communities of practice. Idea Group Inc., IGI.
DOI: https://doi.org/10.4018/978-1-59140-200-8

Hill, S.B., Wilson, S., & Watson, K. (2004). Learning Ecology. A New Approach to Learning and Transforming Ecological Consciousness. In O’Sullivan, E.V. & Taylor, M.M., (eds.), Learning Toward an Ecological Consciousness: Selected Transformative Practices. Palgrave Macmillan, New York.
DOI: https://doi.org/10.1007/978-1-349-73178-7_4

Hubbard, C., Gilbert, S., & Pidcock, J. (2006). Assessment processes in speaking tests: a pilot verbal protocol study. ESOL Research Notes, 24, 14-19.

Hubbard, Ch. (2011). Cambridge ESOL Professional Support Network Extranet: Development and impact. ESOL Research Notes, 49, 17-26.

Hymes, D. (1964). Introduction: Toward Ethnographies of Communication. American Anthropologist, 66(6), 1–34.
DOI: https://doi.org/10.1525/aa.1964.66.suppl_3.02a00010

Hymes, D. (1972). On communicative competence. In Pride, J.B. & Holmes, J., (eds.), Sociolinguistics, pp. 269–285. Penguin.

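Journal of Laws 2007 no. 188, item 1374: Rozporządzenie Ministra Nauki i Szkolnictwa Wyższego z dnia 25 września 2007 r. w sprawie warunków, jakie muszą być spełnione, aby zajęcia dydaktyczne na studiach mogły być prowadzone z wykorzystaniem metod i technik kształcenia na odległość [Regulation of the Minister of Science and Higher Education of 25 September 2007 on the conditions that must be met for university classes to be conducted using distance learning methods and techniques], Dz. U. 2007 nr 188, poz. 1374, z późn. zm. [as amended].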

Kang, O., & Ginther, A. (2019). Assessment in second language pronunciation. Taylor and Francis.

Krahmer, E., & Swerts, M. (2004). More about brows. In Ruttkay, Z. & Pelachaud, C., (eds.), From brows to trust: Evaluating embodied conversational agents, pp. 194–216. Kluwer Academic Press.
DOI: https://doi.org/10.1007/1-4020-2730-3_7

Krakowian, P. (2010). Modern Test Theory Explained. Scholar.

Krakowian, P. (2011). Investigating Rater Performance in Tests of Oral Expression. Wydawnictwo Uniwersytetu Łódzkiego.

Lustig, M. W., & Koester, J. (1993). Intercultural Competence. Interpersonal Communication across Cultures. Harper Collins College Publishers.

Lustig, M. W., & Koester, J. (2009). Intercultural Competence: Interpersonal Communication Across Cultures, 6th Edition. Allyn and Bacon.

Martinez, L. (2009). How Examiners of Different Severity Grade Candidates of Different Ability. Test Insights 2009. Measurement Research Associates, Inc.

Martinez, L. (2010). The Relationship between Examiner Severity and Consistency. Test Insights 2010. Measurement Research Associates, Inc.

Nakane, I. (2007). Silence in Intercultural Communication: perceptions and performance. John Benjamins.
DOI: https://doi.org/10.1075/pbns.166

National Research Council (NRC). (2015). Identifying and supporting productive programs in out-of-school settings. Washington, DC: National Academies Press.

O’Sullivan, B. (2008). Modelling Performance in Tests of Spoken Language. Peter Lang.

Ockey, G. J. (2009). The effects of group members' personalities on a test taker's L2 group oral discussion test scores. Language Testing, 26, 161-179.
DOI: https://doi.org/10.1177/0265532208101005

Orr, M. (2002). The FCE Speaking test: using rater reports to help interpret test scores. System, 30(2), 143-154.
DOI: https://doi.org/10.1016/S0346-251X(02)00002-7

Saint-Onge, H., & Wallace, D. (2003). Leveraging communities of practice for strategic advantage. Butterworth-Heinemann.
DOI: https://doi.org/10.1016/B978-0-7506-7458-4.50007-1

Scollon, R., & Scollon, S.W. (1995). Intercultural Communication. Blackwell.

Seed, G. (2012). Perceptions of authenticity in academic test tasks. ESOL Research Notes, 49, 17-26.

Seiter, J., Weger, H., Jensen, A., & Kinzer, H. (2010). The Role of Background Behavior in Televised Debates: Does Displaying Nonverbal Agreement and/or Disagreement Benefit Either Debater? Journal of Social Psychology, 150(3), 278–300.
DOI: https://doi.org/10.1080/00224540903510811

Taylor, L., & Falvey, P. (2007). IELTS Collected Papers: Research in speaking and writing assessment. Cambridge ESOL Research Notes, Volume 19.

Tynan, R. (2015). Creating ePortfolios to facilitate and evidence progress using learning technologies. Cambridge Exams Research Notes, 61, 55-68.

Weir, C., & Milanovic, M. (2003). Continuity and Innovation: Revising the Cambridge Proficiency in English Examination 1913-2002. Cambridge ESOL Research Notes, 15.

Wenger, E. (1998). Communities of Practice: Learning, Meaning, and Identity. Cambridge University Press.
DOI: https://doi.org/10.1017/CBO9780511803932

Wenger, E., McDermott, R.A., & Snyder, W. (2002). Cultivating communities of practice: a guide to managing knowledge. Harvard Business Press.

Wilson, M. (2005). Constructing Measures: An Item Response Model. Lawrence Erlbaum Associates.

Wright, B.D., & Masters, G. (1982). Rating Scale Analysis. MESA Press.

Wright, B.D., & Stone, M.H. (1979). Best test design. MESA Press.

Published

2023-02-09

How to Cite

Krakowian, P. (2023). Investigating Rater Perceptions in the Assessment of Speaking. Research in Language, 20(3), 277–289. https://doi.org/10.18778/1731-7533.20.3.04

Section

Articles