Alignment in ASR and L1 Listeners’ Recognition of L2 Learner Speech: French EFL Learners & Dictation.Io

Authors

V. Chanethom and A. Henderson

DOI:

https://doi.org/10.18778/1731-7533.21.3.03

Keywords:

English, Automatic Speech Recognition, L2 learner speech, replication, intelligibility, comprehensibility

Abstract

This study extends Inceoglu et al.’s (2023) investigation of Google Voice Typing as a pronunciation learning tool. We used the Automatic Speech Recognition (ASR) tool on the dictation.io website (Agarwal, 2022), and our participants were L2 English learners with a different L1 but a similar proficiency level. Twelve L1 English listeners assessed the L2 English speech of four L1 French speakers for intelligibility and comprehensibility, measured by word transcription and Likert-scale ratings respectively. Their scores were compared with the ASR output. The goal was to determine how accurate the tool is and to what extent its accuracy correlates with the judgments of human listeners. The results were generally consistent with those of Inceoglu et al. (2023), with a few exceptions, which we discuss.

References

Agarwal, A. 2022. dictation.io [Online app]. Digital Inspiration. https://dictation.io/

Boersma, P. and D. Weenink. 2022. Praat: Doing Phonetics by Computer [Computer program]. Version 6.3.09, retrieved 14 August 2022 from http://www.praat.org/

Brown, A. 1988. Functional Load and the Teaching of Pronunciation. TESOL Quarterly 22(4): 593. https://doi.org/10.2307/3587258

DIALANG. 2022. https://dialangweb.lancaster.ac.uk/

Coulange, S. 2023. Computer Aided Pronunciation Training in 2022: When Pedagogy Struggles to Catch Up. In A. Henderson and A. Kirkova-Naskova (Eds.), Proceedings from the 7th International Conference English Pronunciation: Issues & Practices. Université Grenoble-Alpes, May 2022 (pp. 11–22). Grenoble, France. https://hal.science/hal-04159763

Derwing, T. M. 2010. Utopian Goals for Pronunciation Teaching. In J. Levis and K. LeVelle (Eds.), Proceedings of the 1st Pronunciation in Second Language Learning and Teaching Conference. Iowa State University, Sept. 2009 (pp. 24–37). Ames, IA: Iowa State University. https://www.iastatedigitalpress.com/psllt/article/id/15147/

Fouz-González, J. 2015. Trends and Directions in Computer-Assisted Pronunciation Training. In J. A. Mompéan and J. Fouz-González (Eds.), Investigating English Pronunciation: Trends and Directions. Basingstoke and New York: Palgrave Macmillan: 314–342. https://doi.org/10.1057/9781137509437_14

Golonka, E. M., A. R. Bowles, V. M. Frank, D. L. Richardson, and S. Freynik. 2014. Technologies for Foreign Language Learning: A Review of Technology Types and Their Effectiveness. Computer Assisted Language Learning 27(1): 70–105. https://doi.org/10.1080/09588221.2012.700315

Henrichsen, L. E. 2021. An Illustrated Taxonomy of Online CAPT Resources. RELC Journal 52(1): 179–188. https://doi.org/10.1177/0033688220954560

Inceoglu, S., W-H. Chen, and H. Lim. 2023. Assessment of L2 Intelligibility: Comparing L1 Listeners and Automatic Speech Recognition. ReCALL 35(1): 89–104. https://doi.org/10.1017/S0958344022000192

Inceoglu, S., H. Lim, and W-H. Chen. 2020. ASR for EFL Pronunciation Practice: Segmental Development and Learners' Beliefs. The Journal of Asia TEFL 17(3): 824-840. https://doi.org/10.18823/asiatefl.2020.17.3.5.824

Jułkowska, I. A. and J. Cebrian. 2015. Effects of Listener Factors and Stimulus Properties on the Intelligibility, Comprehensibility and Accentedness of L2 Speech. Journal of Second Language Pronunciation 1(2): 211–237. https://doi.org/10.1075/jslp.1.2.04jul

Kennedy, S. and P. Trofimovich. 2008. Intelligibility, Comprehensibility and Accentedness of L2 Speech: The Role of Listener Experience and Semantic Context. Canadian Modern Language Review 64(3): 459–489. https://doi.org/10.3138/cmlr.64.3.459

Kivistö de Souza, H. and W. Gottardi. 2022. How Well Can ASR Technology Understand Foreign-accented Speech? Trabalhos em Linguística Aplicada 61(3): 764–781. https://doi.org/10.1590/010318138668782v61n32022

Landis, J. R. and G. G. Koch. 1977. An Application of Hierarchical Kappa-type Statistics in the Assessment of Majority Agreement among Multiple Observers. Biometrics 33(2): 363–374. https://doi.org/10.2307/2529786

Levis, J. W. and R. Suvorov. 2020. Automatic Speech Recognition. In C. A. Chapelle (Ed.), The Concise Encyclopedia of Applied Linguistics. Hoboken: Wiley-Blackwell: 149–156. https://doi.org/10.1002/9781405198431.wbeal0066.pub2

Liakin, D., W. Cardoso, and N. Liakina. 2017. Mobilizing Instruction in a Second-Language Context: Learners’ Perceptions of Two Speech Technologies. Languages 2(3): 11. https://doi.org/10.3390/languages2030011

McCrocklin, S. and I. Edalatishams. 2020. Revisiting popular speech recognition software for ESL speech. TESOL Quarterly 54(4): 1086–1097. https://doi.org/10.1002/tesq.3006

McCrocklin, S., A. Humaidan, and I. Edalatishams. 2019. ASR Dictation Program Accuracy: Have Current Programs Improved? In J. M. Levis, C. Nagle, and E. Todey (Eds.), Proceedings of the 10th Pronunciation in Second Language Learning and Teaching Conference. Ames, IA, Sept. 2018 (pp. 191–200). Ames, IA: Iowa State University. https://iastate.box.com/shared/static/wtnv3yg890ze2ibtkihwdpts7bfojt8h.pdf

Moussalli, S. and W. Cardoso. 2020. I Computer Assisted Language Learning 33(8): 865-890. https://doi.org/10.1080/09588221.2019.1595664

Mroz, A. 2018. Seeing How People Hear You: French Learners Experiencing Intelligibility through Automatic Speech Recognition. Foreign Language Annals 51(3): 617–637. https://doi.org/10.1111/flan.12348

Munro, M. J. 2019, December 12. Where to Next? Thoughts on the Future of Pronunciation Research. [Plenary]. 13th International Conference on Native and Non-native Accents of English, University of Łódź, Poland.

Munro, M. J. 2021. On the Difficulty of Defining “Difficult” in Second-Language Vowel Acquisition. Frontiers in Communication 6: 1–15. https://doi.org/10.3389/fcomm.2021.639398

Munro, M. J. and T. M. Derwing. 1995a. Processing Time, Accent, and Comprehensibility in the Perception of Native and Foreign-accented Speech. Language and Speech 38(3): 289–306. https://doi.org/10.1177/002383099503800305

Munro, M. J. and T. M. Derwing. 1995b. Foreign Accent, Comprehensibility, and Intelligibility in the Speech of Second Language Learners. Language Learning 45(1): 73–97. https://doi.org/10.1111/j.1467-1770.1995.tb00963.x

Peirce, J. W., J. R. Gray, S. Simpson, M. R. MacAskill, R. Höchenberger, H. Sogo, E. Kastman, and J. K. Lindeløv. 2019. PsychoPy2: Experiments in Behavior Made Easy. Behavior Research Methods 51(1): 195–203. https://doi.org/10.3758/s13428-018-01193-y

Putri Yaniafari, R., V. Olivia, and Suharayadi. 2022. The Potential of ASR for Improving English Pronunciation: A Review. KnE Social Sciences 7(7): 281–289. https://doi.org/10.18502/kss.v7i7.10670

R Core Team. 2022. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/

Suzukida, Y. and K. Saito. 2022. What Is Second Language Pronunciation Proficiency? An Empirical Study. System 106: 102754. https://doi.org/10.1016/j.system.2022.102754

Suzukida, Y. and K. Saito. 2021. Which Segmental Features Matter for Successful L2 Comprehensibility? Revisiting and Generalizing the Pedagogical Value of the Functional Load Principle. Language Teaching Research 25(3): 431–450. https://doi.org/10.1177/1362168819858246

Swan, M. and B. Smith. 2001. Learner English: A Teacher’s Guide to Interference and Other Problems (2nd ed.). Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511667121

Thomson, R. I. 2018. Measurement of Accentedness, Intelligibility and Comprehensibility. In O. Kang and A. Ginther (Eds.), Assessment in Second Language Pronunciation (pp. 11–29). Routledge. https://doi.org/10.4324/9781315170756-2

Verdugo, D. R. 2006. A Study of Intonation Awareness and Learning in Non-native Speakers of English. Language Awareness 15(3): 141–159. https://doi.org/10.2167/la404.0

Published

2023-12-28

How to Cite

Chanethom, V., & Henderson, A. (2023). Alignment in ASR and L1 Listeners’ Recognition of L2 Learner Speech: French EFL Learners & Dictation.Io. Research in Language, 21(3), 245–266. https://doi.org/10.18778/1731-7533.21.3.03

Section

Articles