Partisan news, polarized audiences? A quantitative content analysis of digital news and user comments in India

by Apeksha Shetty


Thesis supervisor: dr. Eric L. Tsetsi

Multiple commentators have suggested that India, often called the world’s largest democracy, is beginning to see US-style polarization characterized by bitter divisions and the filtering of issues through a partisan lens (Masih & Slater, 2019; Neyazi, 2017). The country’s new media are often blamed for this shift (Neyazi, 2017; Sahoo, 2020). However, while the partisanship of Indian media has been theorized, quantitative studies on the subject are lacking, particularly for digital news outlets. This thesis drew upon a cross-sectional content analysis of online articles from four news outlets and their user comments to assess the extent of partisan bias among digital news organizations in India and to study its links to affective polarization in the responses of Facebook commenters.


This study used both computer-assisted and manual content analysis to measure partisan bias in the news coverage and affective polarization in the user comments, and to study the correlation between the two. The articles (n = 486) were manually analyzed using the measures of partisan bias outlined in D’Alessio and Allen’s (2000) work.

To measure the presence and extent of affective polarization, this study attempted a hybrid approach involving manual content analysis followed by supervised machine learning. As a first step, 3,025 comments were coded manually for the presence (1) or absence (0) of in-group favoritism and out-group hostility. The in-group and out-group identified were also tagged whenever a comment expressed them. Next, supervised machine learning and deep learning tools were evaluated on this labelled dataset of ~3,000 comments, with the aim of labelling the remaining ~57,000 comments automatically. However, these methods did not perform well, and the F1 scores were generally low. An overview of the steps taken, their performance metrics, and the potential reasons for this poor performance is given in the thesis. Because of this poor performance, sentiment analysis was used instead, with SentiStrength (Thelwall et al., 2010), a sentiment analysis tool for short, informal text. The results of the sentiment analysis and the data from the manually coded user comments were used to understand the difference in affective polarization between comments on partisan and centrist news media.
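As an illustration of the evaluation metric mentioned above, the following minimal Python sketch computes the F1 score for binary presence (1) / absence (0) labels. The function name and the toy label lists are hypothetical and are not taken from the thesis data; the sketch only shows how precision and recall for the positive class combine into a single score.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 score for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero by convention
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = out-group hostility present, 0 = absent).
manual_codes = [1, 0, 0, 1, 0, 0, 0, 1]
model_labels = [1, 0, 0, 0, 0, 1, 0, 0]

accuracy = sum(t == p for t, p in zip(manual_codes, model_labels)) / len(manual_codes)
print(f"accuracy = {accuracy:.3f}, F1 = {f1_score_binary(manual_codes, model_labels):.3f}")
```

In this toy case the classifier agrees with the manual codes on five of eight comments, yet it recovers only one of the three positive cases, which is why the F1 score is lower than the accuracy.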

To study the relationship between the user comments and articles, the data were analyzed using Wordfish (Slapin & Proksch, 2008), an unsupervised one-dimensional text scaling method that estimates document positions from observed word frequencies. It assumes that the relative use of words indicates placement in policy space and uses the underlying statistical distribution of word counts to estimate the relative importance of each word (Slapin & Proksch, 2008). In this study, Wordfish was used to estimate the relative position of each article on the left-to-right scale. The same procedure was applied to the comments: all comments on an article were combined into a single document, and Wordfish was then used to estimate the relative position of each article’s comment section. The scores of the articles and their comment sections were then compared to examine correlations.
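For reference, the Wordfish model described by Slapin and Proksch (2008) can be summarized as a Poisson model of word counts. The notation below is simplified to documents i and words j (the original paper indexes documents by party and year), so it should be read as a sketch of the model rather than the exact specification estimated in this study.

```latex
y_{ij} \sim \mathrm{Poisson}(\lambda_{ij}), \qquad
\lambda_{ij} = \exp\left(\alpha_i + \psi_j + \beta_j \, \omega_i\right)
```

Here y_{ij} is the count of word j in document i, \alpha_i is a document fixed effect that absorbs document length, \psi_j is a word fixed effect that absorbs overall word frequency, \beta_j is a word-specific weight capturing how strongly word j discriminates between positions, and \omega_i is the estimated position of document i on the single latent dimension.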


The study revealed a greater degree of partisan bias in the news coverage of more ideologically inclined media than in that of centrist digital media outlets in India. This was in line with the hypothesized ideological inclinations of the news media. The results of the sentiment analysis showed significant differences between commenters on partisan and nonpartisan news outlets, with comments on the partisan outlets being generally more negative than comments on the nonpartisan outlets. The manually coded comments also revealed that the comment sections of the media outlets differed in both in-group and out-group identification based on political and religious beliefs. Finally, Wordfish helped identify a positive correlation between the partisan positioning of the news outlets and that of their associated comment sections. This finding indicated that either the pattern of news engagement was linked to partisan identity or there was selective engagement with news based on political identity.

The study adds to the literature on partisanship in media coverage and how it may be linked to polarization and divisiveness on Facebook. Additionally, by combining automated and manual analysis of user comments, the study implements an alternative approach for identifying and examining polarization on social media. This offers a different way to study polarization when representative surveys or experimental studies are not feasible. Lastly, the study also reveals the unique challenges associated with the automated analysis of social media data in multilingual countries where users frequently code-switch or use non-standard language to communicate.


D’Alessio, D., & Allen, M. (2000). Media bias in presidential elections: A meta-analysis. Journal of Communication, 50(4), 133–156.

Masih, N., & Slater, J. (2019, May 20). U.S.-style polarization has arrived in India. Modi is at the heart of the divide. Washington Post.

Neyazi, T. A. (2017, November). Social media and political polarization in India. Seminar, 699, 31–35.

Sahoo, N. (2020). Mounting majoritarianism and political polarization in India. In Political polarization in South and Southeast Asia: Old divisions, new dangers (pp. 9–23). Carnegie Endowment for International Peace.

Slapin, J. B., & Proksch, S.-O. (2008). A scaling model for estimating time-series party positions from texts. American Journal of Political Science, 52(3), 705–722.

Thelwall, M., Buckley, K., Paltoglou, G., Cai, D., & Kappas, A. (2010). Sentiment strength detection in short informal text. Journal of the American Society for Information Science and Technology, 61(12), 2544–2558.