On the "Subtleties" of Methods for Assessing the Tone of Written Statements. A Comparison of Three Approaches to Sentiment Analysis

Krzysztof Tomanek

doi:10.18778/1733-8069.20.4.04

Author

Krzysztof Tomanek, Jagiellonian University, https://orcid.org/0000-0003-1789-0006


Keywords:

NLP, ML, artificial intelligence, sentiment analysis, sentiment dictionaries, qualitative analysis

Abstract

This article presents the results of a methodological experiment in which three methods for analyzing statements recorded as text, each different in its logic and application, were applied to the same research material. The aim is to show how the three analytical approaches differ: analysis based on interpretive reading of the text (manual coding); semi-automatic, supervised analysis (performed with a classification dictionary programmed by a human and based on transparent rules, a method from the field of machine learning, ML); and a non-transparent, unsupervised method (artificial intelligence, ChatGPT version 3.5). The study concerns sentiment analysis, also known as tone analysis. It focuses largely on how these methods are applied and on explaining the differences in the results obtained.
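As a rough illustration of what a transparent, rule-based classification dictionary can look like, the sketch below sums word polarities and lets a negation word flip the polarity of the word directly after it. It is not the dictionary used in the article: the lexicon entries, the negation list, and the function name `classify` are all hypothetical and chosen only for this example.

```python
import re

# Hypothetical mini-lexicon (assumption): word -> polarity.
# The article's actual classification dictionary is not reproduced here.
LEXICON = {
    "good": 1, "helpful": 1, "clear": 1,
    "bad": -1, "boring": -1, "chaotic": -1,
}

# Hypothetical negation words (assumption).
NEGATIONS = {"not", "never", "no"}


def classify(text: str) -> str:
    """Return 'positive', 'negative', or 'neutral' for a piece of text.

    Rule 1: each lexicon word contributes its polarity to the score.
    Rule 2: a negation word flips the polarity of the word directly after it.
    """
    tokens = re.findall(r"\w+", text.lower())
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
            continue
        polarity = LEXICON.get(token, 0)
        score += -polarity if negate else polarity
        negate = False  # negation only reaches the immediately following word
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in (
        "The lecture was clear and helpful",
        "The class was not good",
        "The room was large",
    ):
        print(sentence, "->", classify(sentence))
```

Every decision in such a classifier can be traced back to a specific dictionary entry or rule, which is the sense in which the abstract contrasts the dictionary method with a non-transparent model.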

Author biography

Krzysztof Tomanek - Jagiellonian University

Sociologist, PhD in social sciences, affiliated with the Institute of Sociology at the Jagiellonian University. Co-founder of the CAQDAS TM Lab at the Institute of Sociology, Jagiellonian University. His work focuses mainly on applying methods for qualitative and quantitative data analysis, including machine learning and AI in the social sciences. He is also interested in, and works daily with, data visualization methods, storytelling, and network theory in social research. For seven years he has been analyzing the art projects of Rita Leistner. He is socially engaged and active as a volunteer. Member of the NGO POLITES Association, PTS, and PTE.
