"Sentiment Analysis". An Example of Application and Evaluation of RID Dictionary and Bayesian Classification Methods in Qualitative Data Analysis Approach
DOI: https://doi.org/10.18778/1733-8069.10.2.07
Keywords: qualitative data analysis, sentiment analysis, content analysis, text mining, coding techniques, natural language processing, RID dictionary, naive Bayes, CAQDAS
Abstract
This article presents basic methods for classifying text data. These methods draw on advances in natural language processing and the analysis of unstructured data. I introduce and compare two analytical techniques applied to text data. The first uses a thematic dictionary (sentiment analysis). The second is based on Bayesian classification and applies the so-called naive Bayes algorithm. The comparison assesses how effectively each technique classifies text. I emphasize solutions for building a dictionary that is accurate for the text classification task. I then compare the effectiveness of supervised classification with that of automated unsupervised analysis. The results reinforce the conclusion that a dictionary which has been evaluated favourably as a classification tool should still be reviewed and modified before it is applied to new empirical material. In the approach I propose, these procedures for adapting the analytical dictionary become the basic step in the methodology of textual data analysis.
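To make the contrast between the two techniques concrete, the following is a minimal sketch in Python and not the procedure used in the article. The word lists, the training sentences, and all function names are hypothetical illustrations; the RID dictionary is far larger and has its own category structure. The sketch assumes whitespace tokenization, a two-class problem, and add-one (Laplace) smoothing for the naive Bayes model.

```python
import math
from collections import Counter

# Hypothetical toy dictionary; it stands in for a real thematic dictionary.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def dictionary_label(text):
    """Label a text by counting dictionary hits (no training data needed)."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neutral"


def train_naive_bayes(labeled_docs):
    """Collect per-class document counts and word counts from labeled texts."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def naive_bayes_label(text, model):
    """Return the class with the highest log posterior under naive Bayes."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n_docs in class_counts.items():
        logp = math.log(n_docs / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing keeps unseen word/class pairs from zeroing out.
            logp += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Hypothetical hand-labeled training texts (the supervised step).
TRAIN = [
    ("great film love it", "pos"),
    ("good story great cast", "pos"),
    ("awful plot hate it", "neg"),
    ("bad acting awful script", "neg"),
]
model = train_naive_bayes(TRAIN)
```

In this toy setting, a sentence such as "the script was dull" contains no dictionary word, so the dictionary approach leaves it unclassified, while the trained model can still use the word "script" because it appeared in the labeled examples. This is only an illustration of why a dictionary may need adapting to new material, not a reproduction of the article's evaluation.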
License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.