On the “Subtleties” of Methods for Assessing the Sentiment of Written Statements. A Comparison of Three Approaches in Sentiment Analysis

Krzysztof Tomanek

doi:10.18778/1733-8069.20.4.04

Authors

Krzysztof Tomanek Uniwersytet Jagielloński https://orcid.org/0000-0003-1789-0006

Keywords:

NLP, ML, Artificial Intelligence, Sentiment analysis, Sentiment Dictionary, Qualitative analysis

Abstract

This article presents the results of a methodological experiment in which three methods of analyzing written statements, each differing in logic and application, were applied to the same research material. Its purpose is to identify the differences between the three analytical approaches: analysis based on reading the text with comprehension (manual coding); semi-automatic, supervised analysis (performed by a classification dictionary programmed by a human and based on transparent rules, a method from the field of machine learning, ML); and a non-transparent, unsupervised method (artificial intelligence, here ChatGPT version 3.5). The study concerns sentiment analysis. Most attention is devoted to how these methods are applied and to explaining the differences in the results obtained.

Author Biography

Krzysztof Tomanek, Uniwersytet Jagielloński

Sociologist with a doctorate in social sciences, affiliated with the Institute of Sociology at Uniwersytet Jagielloński. Co-founder of the CAQDAS TM Lab at the Institute of Sociology, Uniwersytet Jagielloński. His work focuses mainly on applying methods for qualitative and quantitative data analysis, including the use of machine learning and AI in the social sciences. He is also interested in, and works daily with, data visualization methods, storytelling, and network theory in social research. For seven years he has been analyzing the art projects of Rita Leistner. He is active in community and volunteer work. Member of the NGO association POLITES, PTS, and PTE.
