Investigating Rater Perceptions in the Assessment of Speaking
DOI:
https://doi.org/10.18778/1731-7533.20.3.04
Keywords:
Assessment, speaking, assessment of oral production, reliability, bias
Abstract
In the assessment of spoken production, numerous reasons can be identified behind the decisions that raters make in evaluating samples of oral performance. Inter- and intra-rater factors are relatively well documented in various reliability and validity studies. Some of the factors identified in the literature involve the effects of examinee pairing or familiarity with the examinees; others point in the direction of gender and gender role perceptions (O’Sullivan 2008); still others appear to be connected with body language and non-verbal cues that accompany oral production (cf. Krahmer and Swerts 2004, Seiter, Weger, Jensen and Kinzer 2010). While some studies that address the assessment of spoken English in exam contexts suggest that raters may not feel as comfortable assessing pronunciation as they do other aspects of a speaker’s performance (Orr 2002, Hubbard, Gilbert and Pidcock 2006, Brown 2006, De Velle 2008), more recent investigations of rater behaviour involving electronic evidence from training, maintenance and online examination programmes tentatively show that pronunciation is, in fact, the first category examiners attend to (Hubbard 2011, Chambers and Ingham 2011, Krakowian 2011, Seed 2012, Tynan 2015, Kang and Ginther 2019). This paper looks at a large collection of assessments stored in an electronic system to investigate what raters actually seem to pay attention to when ostensibly following rating scales.
References
Bachman, L.F., & Palmer, A. (2002). Language Testing in Practice. Oxford University Press
Barna, M.L. (1994). Stumbling Blocks in Intercultural Communication. In Samovar L.A. & R.E. Porter, (eds.), Intercultural Communication. Wadsworth.
Bennett, M.J. (1993). Towards Ethnorelativism: A Developmental Model of Intercultural Sensitivity. In Paige, R.M. (ed.), Education for the Intercultural Experience, pp. 21-71. Intercultural Press.
Bond, T.G., & Fox, C.M. (2007). Applying the Rasch Model. Fundamental Measurement in the Human Sciences. University of Toledo Press
Brown, A. (2006). An examination of the rating process in the revised IELTS Speaking Test. IELTS Research Reports, 6, 41-65. IELTS Australia, Canberra and British Council, London.
Cavé, C., Guaïtella, I., & Santi, S. (2002). Eyebrow movements and voice variations in dialogue situations. In Hansen, J.H.L & Pellom, B., (eds.), Proceedings of the 7th International Conference on Spoken Language Processing, 2353–2356. DOI: https://doi.org/10.21437/ICSLP.2002-225
Chambers, L., & Ingham, K. (2011). The BULATS Online Speaking Test. ESOL Research Notes, 34, 21-25.
De Velle, S. (2008). The revised IELTS Pronunciation scale. ESOL Research Notes, 34, 36-38.
Fulcher, G. (2003). Testing Second Language Speaking. Pearson Longman
Guaïtella, I., Santi, S., Lagrue, B., & Cavé, Ch. (2009). Are Eyebrow Movements Linked to Voice Variations and Turn-taking in Dialogue? An Experimental Investigation. Language and Speech, 57, 207-222. DOI: https://doi.org/10.1177/0023830909103167
Hall, E., & Hall, M. (1990). Understanding cultural differences: Germans, French and Americans. Intercultural Press
Hall, E. (1959). The silent language. Doubleday
Hall, E. (1966). The hidden dimension. Doubleday Anchor Books
Hawkey, R. (2004). A Modular Approach to Testing English Language Skills: The development of the Certificates in English Language Skills, CELS, examinations. Cambridge ESOL Research Notes Volume 16
Hildreth, P.M., & Kimble, Ch. (2004). Knowledge networks: innovation through communities of practice. Idea Group Inc., IGI. DOI: https://doi.org/10.4018/978-1-59140-200-8
Hill, S.B., Wilson, S., Watson, K. (2004). Learning Ecology. A New Approach to Learning and Transforming Ecological Consciousness. In: O’Sullivan, E.V., Taylor, M.M., (eds.) Learning Toward an Ecological Consciousness: Selected Transformative Practices. Palgrave Macmillan, New York. DOI: https://doi.org/10.1007/978-1-349-73178-7_4
Hubbard, C., Gilbert, S., & Pidcock, J. (2006). Assessment processes in speaking tests: a pilot verbal protocol study. ESOL Research Notes, 24, 14-19.
Hubbard, Ch. (2011). Cambridge ESOL Professional Support Network Extranet: Development and impact. ESOL Research Notes, 49, 17-26.
Hymes, D. (1964). Introduction: Toward Ethnographies of Communication. American Anthropologist, 66(6), pp. 1–34. DOI: https://doi.org/10.1525/aa.1964.66.suppl_3.02a00010
Hymes, D. (1972). On communicative competence. In J. B. Pride & Holmes, J., (eds.), Sociolinguistics, pp. 269–285. Penguin.
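Journal of Laws 2007 no. 188, item 1374: Rozporządzenie Ministra Nauki i Szkolnictwa Wyższego z dnia 25 września 2007 r. w sprawie warunków, jakie muszą być spełnione, aby zajęcia dydaktyczne na studiach mogły być prowadzone z wykorzystaniem metod i technik kształcenia na odległość [Regulation of the Minister of Science and Higher Education of 25 September 2007 on the conditions that must be met for university classes to be conducted using distance-learning methods and techniques], Dz. U. 2007 nr 188, poz. 1374, z późn. zm. [as amended].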
Kang, O., & Ginther, A. (2019). Assessment in second language pronunciation. Taylor and Francis.
Krahmer, E., & Swerts, M. (2004). More about brows. In Ruttkay, Z. & Pelachaud, C., (eds.), From brows to trust: Evaluating embodied conversational agents, pp.194–216. Kluwer Academic Press. DOI: https://doi.org/10.1007/1-4020-2730-3_7
Krakowian, P. (2010). Modern Test Theory Explained. Scholar
Krakowian, P. (2011). Investigating Rater Performance in Tests of Oral Expression. Wydawnictwo Uniwersytetu Łódzkiego
Lustig, M. W., & Koester, J. (1993). Intercultural Competence. Interpersonal Communication across Cultures. Harper Collins College Publishers.
Lustig, M. W., & Koester, J. (2009). Intercultural Competence: Interpersonal Communication Across Cultures, 6th Edition. Allyn and Bacon
Martinez, L. (2009). How Examiners of Different Severity Grade Candidates of Different Ability. Test Insights 2009. Measurement Research Associates, Inc.
Martinez, L. (2010). The Relationship between Examiner Severity and Consistency. Test Insights 2010. Measurement Research Associates, Inc.
Nakane, I. (2007). Silence in Intercultural Communication: perceptions and performance. John Benjamins. DOI: https://doi.org/10.1075/pbns.166
National Research Council (NRC). (2015). Identifying and supporting productive programs in out-of-school settings. Washington, DC: National Academies Press.
O’Sullivan, B. (2008). Modelling Performance in Tests of Spoken Language. Peter Lang
Ockey, G. J. (2009). The effects of group members' personalities on a test taker's L2 group oral discussion test scores. Language Testing 2009, 26, 161-179. DOI: https://doi.org/10.1177/0265532208101005
Orr, M. (2002). The FCE Speaking test: using rater reports to help interpret test scores. System, vol 30, no 2, pp 143-154. DOI: https://doi.org/10.1016/S0346-251X(02)00002-7
Saint-Onge, H., & Wallace, D. (2003). Leveraging communities of practice for strategic advantage. Butterworth-Heinemann DOI: https://doi.org/10.1016/B978-0-7506-7458-4.50007-1
Scollon R., & Scollon, S.W. (1995). Intercultural Communication. Blackwell
Seed, G. (2012). Perceptions of authenticity in academic test tasks. ESOL Research Notes, vol 49, pp. 17-26
Seiter, J., Weger, H., Jensen, A., & Kinzer, H. (2010). The Role of Background Behavior in Televised Debates: Does Displaying Nonverbal Agreement and/or Disagreement Benefit Either Debater? Journal of Social Psychology, 150(3), 278–300. DOI: https://doi.org/10.1080/00224540903510811
Taylor, L., & Falvey, P. (2007). IELTS Collected Papers: Research in speaking and writing assessment. Cambridge ESOL Research Notes Volume 19
Tynan, R. (2015). Creating ePortfolios to facilitate and evidence progress using learning technologies. Cambridge Exams Research Notes, 61, 55-68
Weir, C., & Milanovic, M. (2003). Continuity and Innovation: Revising the Cambridge Proficiency in English Examination 1913-2002. Cambridge ESOL Research Notes, 15.
Wenger, E. (1998). Communities of Practice: Learning, Meaning, and Identity. Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511803932
Wenger, E., McDermott, R.A., & Snyder, W. (2002). Cultivating communities of practice: a guide to managing knowledge. Harvard Business Press
Wilson, M. (2005). Constructing Measures: An Item Response Modeling Approach. Lawrence Erlbaum Associates
Wright, B.D., & Masters, G. (1982). Rating Scale Analysis. MESA Press
Wright, B.D., & Stone, M.H. (1979). Best test design. MESA Press
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
