Political Alignment Identification: a Study with Documents of Argentinian Journalists
Political alignment identification is an author profiling task that aims to identify political bias or orientation in people’s writings. As in any automatic text analysis, a critical aspect is the availability of adequate data sets, so that data mining and machine learning approaches can obtain reliable and informative results.
This article contributes in this regard by presenting a new corpus for the study of political alignment in documents written by Argentinian journalists. The study also includes several kinds of analysis of documents by pro-government and opposition journalists, such as the relevance of terms in each journalist class, sentiment analysis, topic modelling, and the analysis of psycholinguistic indicators obtained from the Linguistic Inquiry and Word Count (LIWC) system. The experimental results reveal interesting patterns, such as the topics both types of journalists write about, how sentiment polarities are distributed, and how the writings of pro-government and opposition journalists differ across the LIWC categories.
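As a rough illustration of what "relevance of terms in each journalist class" can mean in practice, the following minimal Python sketch scores each term by a smoothed log ratio of its relative frequency in the two classes. This is an assumption made purely for illustration and is not presented as the scoring method used in the study. The function names, the smoothing parameter alpha, and the toy English documents are all hypothetical and do not come from the corpus.

```python
import math
from collections import Counter


def term_counts(docs):
    """Count lowercase, whitespace-separated tokens across a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def log_ratio_scores(docs_a, docs_b, alpha=1.0):
    """Smoothed log ratio of each term's relative frequency in class A vs. class B.

    Positive scores lean toward class A, negative scores toward class B.
    alpha is an add-alpha smoothing constant (hypothetical default).
    """
    counts_a, counts_b = term_counts(docs_a), term_counts(docs_b)
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    return {
        term: math.log((counts_a[term] + alpha) / total_a)
        - math.log((counts_b[term] + alpha) / total_b)
        for term in vocab
    }


# Toy, invented documents (NOT taken from the corpus described above).
pro_gov = ["the government announced growth", "growth and investment plan"]
opposition = ["the government hides inflation", "inflation and debt crisis"]

scores = log_ratio_scores(pro_gov, opposition)
# Terms ordered from most pro-government-leaning to most opposition-leaning.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy setting, terms that appear equally often in both classes score close to zero, while terms concentrated in one class move toward the ends of the ranking. A real analysis would need proper Spanish tokenization, stop-word handling, and far more text per class.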
Copyright (c) 2020 Viviana Mercado, Andrea Villagra, Marcelo Errecalde
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.