Auditing Algorithmic Personalization in TikTok Comment Sections

Yueru Yan, Siqi Wu

Abstract

Personalization algorithms are ubiquitous in modern social computing systems, yet their effects on comment sections remain underexplored. In this work, we conducted an algorithmic auditing experiment to examine comment personalization on TikTok. We trained sock-puppet accounts to exhibit left-leaning or right-leaning preferences and successfully validated 17 of them by analyzing the videos recommended on their For You Pages. We then scraped the comment sections shown to these trained partisan accounts, along with five cold-start accounts, across 65 politically neutral videos related to the 2024 U.S. presidential election that contain abundant discussions from both left-leaning and right-leaning perspectives. We find that while the composition of top comments remains largely consistent across all videos, for some videos the ranking divergence between accounts from different political groups is significantly greater than that observed within the same group. This effect is strongly correlated with video-level metrics such as comment volume, engagement inequality, and partisan skew in the comment sections. Furthermore, through an exploratory case study, we find preliminary evidence that personalization can result in comment exposure aligned with an account's political leaning. However, this pattern is not universal, suggesting that the extent of politically oriented comment personalization is context-dependent.
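The abstract distinguishes two kinds of comparison between accounts' comment sections: differences in *composition* (which comments appear) and differences in *ranking* (the order they appear in). The exact JD and NDLD formulas used in the paper are not defined in this excerpt; the sketch below illustrates one plausible pair of measures, assuming a Jaccard distance over comment sets and a position-discounted, normalized list divergence in which disagreements near the top of the ranking are weighted more heavily. The function names and the 1/log2 discount are illustrative assumptions, not the authors' exact definitions.

```python
import math


def jaccard_distance(a, b):
    """Jaccard distance between two sets of comment IDs (0 = identical sets)."""
    a, b = set(a), set(b)
    return 1 - len(a & b) / len(a | b)


def discounted_list_divergence(r1, r2):
    """Position-discounted divergence between two ranked comment lists.

    For each depth k, compare the top-k prefixes of both rankings and
    weight the size of their symmetric difference by 1 / log2(k + 1),
    so disagreements near the top of the list count more. The result is
    normalized to [0, 1] by the worst case (fully disjoint prefixes).
    """
    depth = max(len(r1), len(r2))
    num = den = 0.0
    for k in range(1, depth + 1):
        top1, top2 = set(r1[:k]), set(r2[:k])
        weight = 1.0 / math.log2(k + 1)
        num += weight * len(top1 ^ top2)  # comments in one prefix but not both
        den += weight * 2 * k             # maximum possible disagreement at depth k
    return num / den
```

Under these definitions, identical lists score 0 on both measures and fully disjoint lists score 1; two lists with the same comments in a different order score 0 on Jaccard distance but greater than 0 on the rank divergence, which is the dissociation the abstract reports (stable composition, diverging rankings).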

Paper Structure

This paper contains 26 sections, 1 equation, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Overview of the experiment workflow. The process begins with constructing an auxiliary dataset to identify valid partisan channels for training and center videos for testing. We then follow a "train-validate-evaluate" framework: (1) training sock-puppet accounts to perform "search-then-watch" actions for selected partisan channels, (2) validating their platform-perceived political leanings by examining the videos recommended on FYP feeds, and (3) scraping top-level comments and their ordering from test videos to measure how the algorithm alters comment visibility for different political groups.
  • Figure 3: Heatmap of Spearman's rank correlations between video characteristics and personalization metrics. Ranking divergence (NDLD columns) increases significantly with discussion volume, engagement inequality, and imbalance of partisan discussion, whereas content composition (JD columns) shows little to no correlation with these factors.
  • Figure 4: $k$-means clustering of test videos based on comment section features. The projection reveals three distinct clusters, notably Cluster 2 (yellow), which contains videos characterized by high engagement (high PC1) and strong personalization (low PC2).
  • Figure 5: Comparison of $\mathbb{R}_{L,R}$(NDLD) across the three video clusters. Videos in Cluster 2 exhibit significantly higher levels of personalization between $L$ and $R$ accounts compared to Cluster 0 and Cluster 1 (Mann–Whitney U test, $p < 0.001$).
  • Figure 6: The difference of partisan exposure, (a) to left-leaning comments and (b) to right-leaning comments, between nine left-leaning and eight right-leaning accounts at position $k$. Panel (a) exhibits a heterogeneous pattern. On video "7424...3694", left-leaning accounts were exposed to more left-leaning comments at position $7$ but more right-leaning comments at position $4$. Panel (b) shows a more consistent pattern of negative deviations at top positions (e.g., "7431...3770"), indicating that right-leaning accounts are consistently exposed to higher proportions of right-leaning comments than left-leaning accounts.
  • ...and 3 more figures
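Figure 3 reports Spearman's rank correlations between video characteristics (e.g., comment volume) and the personalization metrics. As a minimal, dependency-free sketch (not the authors' code), Spearman's rho can be computed as the Pearson correlation of rank-transformed data, with tied values assigned their average rank:

```python
def rankdata(xs):
    """Assign 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        # Extend j over the run of values tied with xs[order[i]].
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed samples."""
    rx, ry = rankdata(x), rankdata(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

In practice this is equivalent to `scipy.stats.spearmanr`; a rho near +1 here would mean that videos with larger values of a characteristic (say, comment volume) also show larger ranking divergence, which is the pattern the caption describes for the NDLD columns.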