Corpus-Assisted Discourse Studies (CADS) as Support for Qualitative Content Analysis. A Case Study of Using SketchEngine in Discourse Research

Marek Troszyński

doi:10.18778/1733-8069.20.4.03

Author

Marek Troszyński, Collegium Civitas, Warsaw, https://orcid.org/0000-0002-3653-4018

DOI:

https://doi.org/10.18778/1733-8069.20.4.03

Keywords:

corpus linguistics, SketchEngine, qualitative content analysis, mixed methods

Abstract

The article shows how corpus linguistics tools can serve as the first stage of qualitative content analysis. It outlines the development of corpus-assisted discourse studies (CADS). The main part of the article discusses the features of one program that supports CADS, SketchEngine. Numerous examples illustrate how CADS methods and SketchEngine features can be applied to the analysis of Polish press discourse. Because concordances give easy access to the source texts, SketchEngine makes it possible to bring mixed methods into discourse research.

Author biography

Marek Troszyński, Collegium Civitas, Warsaw

PhD, sociologist; he heads the Digital Civilization Observatory (Obserwatorium Cywilizacji Cyfrowej) at Collegium Civitas in Warsaw. He studies media discourse on migrants and refugees in Poland and the language of communications from the war in Ukraine. His research also addresses hate speech against minorities in Poland. In his work he uses corpus linguistics (CL) methods and tools for automated natural language processing (NLP).

