Building a corpus of spoken Galician language. The CORILGA project

Authors

  • José Manuel Dopazo Entenza, Instituto da Lingua Galega (Universidade de Santiago de Compostela), Spain

DOI:

https://doi.org/10.18778/2392-0718.05.04

Keywords:

CORILGA, language variation and change, oral corpus, speech recognition

Abstract

CORILGA (Corpus Oral Informatizado da Lingua Galega) is a corpus of recordings aligned with their transcriptions and annotated at several levels (orthographic, phonetic, morphological, syntactic, etc.). Complete and thorough metadata on the recordings and participants allow the open online search engine to return very accurate results. This information can be used in studies of language variation and change, to create teaching materials, and to develop speech technology.

References

AGO = FERNÁNDEZ REI, F. (dir.) (2010-): Arquivo do Galego Oral. Santiago de Compostela: Instituto da Lingua Galega. http://ilg.usc.es/ago/ [Accessed: 08/03/2016]

AMPER-Galicia = FERNÁNDEZ REI, E. (dir.) (2006-): Atlas Multimedia Prosódico do Espazo Románico. Santiago de Compostela: Instituto da Lingua Galega. http://ilg.usc.es/amper/ [Accessed: 08/03/2016]

BRUGMAN, H. & A. RUSSEL (2004): “Annotating Multimedia/Multi-modal resources with ELAN”. Proceedings of LREC 2004, Fourth International Conference on Language Resources and Evaluation. https://tla.mpi.nl/tools/tlatools/elan/citing_elan/ [Accessed: 14/03/2015]

CORILGA = REGUEIRA FERNÁNDEZ, X. L. (dir.) (2012-): Corpus Oral Informatizado da Lingua Galega. Santiago de Compostela: Instituto da Lingua Galega. http://ilg.usc.es/corilga/ [Accessed: 08/03/2016]

DUBERT GARCÍA, F. X. (1998): A fala de Santiago de Compostela. Estudio xeolingüístico. Doctoral thesis supervised by Rosario Álvarez Blanco. Universidade de Santiago de Compostela (unpublished).

GARCÍA MATEO, C., A. CARDENAL LÓPEZ, X. L. REGUEIRA FERNÁNDEZ, E. FERNÁNDEZ REI, M. MARTÍNEZ MAQUIEIRA, R. SEARA DOPAZO, R. VARELA FERNÁNDEZ & N. BASANTA LLANES (2014): “CORILGA: a Galician Multilevel Annotated Speech Corpus for Linguistic Analysis”. Proc. 9th Language Resources and Evaluation Conference (LREC2014). Reykjavik, 26-31 May 2014.

KUCERA, H. & N. FRANCIS (1967): Computational Analysis of Present-Day American English. Providence: Brown University Press.

REGUEIRA FERNÁNDEZ, X. L. (1989): A fala do norte da Terra Chá: estudio descritivo. Doctoral thesis supervised by Antón L. Santamarina Fernández. Universidade de Santiago de Compostela (unpublished).

SANKOFF, D. & G. SANKOFF (1973): “Sample Survey Methods and Computer-Assisted Analysis in the Study of Grammatical Variation”. In R. DARNELL (ed.), Canadian Languages in Their Social Context. Edmonton: Linguistic Research Incorporated, pp. 7–64.

SANTAMARINA FERNÁNDEZ, A. L., R. ÁLVAREZ BLANCO, F. FERNÁNDEZ REI & M. GONZÁLEZ GONZÁLEZ (1990-2015): Atlas Lingüístico Galego. Vol. I-VI. A Coruña: Fundación Pedro Barrié de la Maza.

Published

2018-03-31

How to Cite

Dopazo Entenza, J. M. (2018). Building a corpus of spoken Galician language. The CORILGA project. E-Scripta Romanica, 5, 28–38. https://doi.org/10.18778/2392-0718.05.04