Topic Modeling in Sociology Using Social Welfare as an Example: Methodological Challenges and the Human Component




topic modeling, methodology of sociology, social welfare, machine learning, natural language processing


As network technologies and the digital humanities reshape the social sciences, it is crucial to examine whether sociological methods of data analysis remain adequate under these new conditions. The availability of extensive digitized datasets not only challenges “classical” analysis methods, which were developed under different circumstances and for different purposes, but also raises the question of whether the traditional, sharply drawn boundary between quantitative and qualitative methods remains relevant in the era of Big Data. Drawing on topic modeling with Latent Dirichlet Allocation (LDA), we argue that quantitative methods (probabilistic statistical models) are not merely a complement to or a starting point for qualitative analyses (the standard view), but rather an integral part of them. We illustrate this thesis with a case study that identifies themes in a dataset of 17,278 articles on social welfare published in Web-of-Science-indexed journals between 1992 and 2020. The case study also serves as a basis for meta-theoretical observations on the “cohesion” of quantitative and qualitative methods in the context of machine learning and natural language processing.
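
For readers unfamiliar with LDA, the kind of model named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy corpus, the function name `fit_lda`, and the hyperparameter values are assumptions made purely for illustration, and collapsed Gibbs sampling is only one common way of fitting LDA.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Toy-scale LDA fitted by collapsed Gibbs sampling (illustrative only)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed estimate of each document's topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # Most frequent words currently assigned to each topic.
    top_words = [
        [w for w, c in tw.most_common(3) if c > 0] for tw in topic_word
    ]
    return theta, top_words


# Hypothetical four-document corpus, already tokenized.
docs = [
    "welfare state policy pension benefit".split(),
    "pension benefit welfare policy reform".split(),
    "auction mechanism price market data".split(),
    "market price auction data mechanism".split(),
]
theta, top_words = fit_lda(docs)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

The sketch returns only word lists and per-document topic proportions. Deciding what each topic means, and whether the chosen number of topics is sensible, remains an interpretive task for the researcher.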



Author Biographies

Piotr Cichocki, Uniwersytet im. Adama Mickiewicza w Poznaniu

PhD, sociologist, research and teaching staff member at the Faculty of Sociology, Adam Mickiewicz University in Poznań. Research interests: monitoring social and political attitudes in cross-national research, machine text analysis, and survey research methodology.

Mariusz Baranowski, Uniwersytet im. Adama Mickiewicza w Poznaniu

PhD, sociologist, research and teaching staff member at the Faculty of Sociology, Adam Mickiewicz University in Poznań. Research interests: economic sociology, political sociology, and issues related to social welfare and the energy transition.




Published: 2024-11-30. Updated on 2025-01-10.


How to Cite

Cichocki, P., & Baranowski, M. (2025). Topic Modeling in Sociology Using Social Welfare as an Example: Methodological Challenges and the Human Component. Przegląd Socjologii Jakościowej, 20(4), 98–117. (Original work published November 30, 2024)



Thematic issue: “Digital Humanities Methods in Qualitative Sociology”
