Dynamics of collective minds in online communities

Seungwoong Ha, Henrik Olsson, Kresimir Jaksic, Max Pellert, Mirta Galesic

TL;DR

This work presents a computational framework that treats online news communities as evolving collective minds encoded as dynamic semantic networks. By calibrating the model with 11 years of data (over 400 million comments) from five U.S. platforms and validating it against a large human survey, the authors quantify how editorial practices (alignment, amplification, reframing) and community dynamics (turnover, trolls, counterspeech) reshape discourse, with memory and filtering controlling persistence. Key findings show that alignment can be reversed quickly, while amplification, reframing, trolls, and turnover can induce lasting changes in the community mind; counterspeech can mitigate troll effects when it is sufficiently strong and timely. The framework guides practical interventions for healthier discourse and can generalize to other platforms and multi-layer policy environments, offering a tool for counterfactual experiments on complex digital ecosystems.

Abstract

How communities respond to diverse societal challenges, from economic crises to political upheavals, is shaped by their collective minds - shared representations of ongoing events and current topics. In turn, collective minds are shaped by a continuous stream of influences, amplified by the rapid rise of online platforms. Online communities must understand these influences to maintain healthy discourse and avoid being manipulated, but understanding is hindered by limited observations and the inability to conduct counterfactual experiments. Here, we show how collective minds in online news communities can be influenced by different editorial agenda-setting practices and aspects of community dynamics, and how these influences can be reversed. We develop a computational model of collective minds, calibrated and validated with data from 400 million comments across five U.S. online news platforms and a large-scale survey. The model enables us to describe and experiment with a variety of influences and derive quantitative insights into their magnitude and persistence in different communities. We find that some editorial influences can be reversed relatively rapidly, but others, such as amplification and reframing of certain topics, as well as community influences such as trolling and counterspeech, tend to persist and durably change the collective mind. These findings illuminate ways collective minds can be manipulated and pathways for communities to maintain healthy and authentic collective discourse amid ongoing societal challenges.

Paper Structure

This paper contains 50 sections, 13 equations, 21 figures, 9 tables.

Figures (21)

  • Figure 1: Computational model and empirical data. a, Conceptual illustration of the computational model. The world that generates events is characterized by the general semantic network. Events are represented by a triplet of topics (here symbolized by geometric shapes), in which the topic that best describes the event is in the first place, followed by two other related but less relevant topics (tier $1$, $2$, and $3$, respectively). At each time step, communities are exposed to the same set of events. Each community has an editorial filter that accepts or rejects events, affected by both general and community semantic networks. The accepted events become news published on the news site of the community. The community semantic network responds to this news through its comment section, which is characterized by a network of interrelated topics. Finally, the community semantic network is updated based on the feedback from the comment network, which will affect the filtering process of the next time step. b, Data collection for calibrating the computational model. First, we gather titles and comments from online news articles, get their BERT embeddings, and use BERTopic to derive topics. We characterize the title by a triplet of topics that best describe it, and each comment by its most relevant topic. For a given time interval, we count the number of comments discussing each topic ($f_i$) and average the embeddings of all such comments to get the topic representation ($e_i$). Finally, we assign weights ($w_{ij}$) for each pair of topics as the cosine similarity of their representations.
  • Figure 2: Quantitative comparison of data and model output. a, Relative frequency distributions of topics in article titles in tiers 1, 2, and 3, in each community, compared with the model simulation (right-most panel). b, Relative frequency distributions of topics in comments in each community (left), compared with the model simulation (right panel). c, Topic similarity distributions in each community (left), compared with the model simulation (right panel). f, Topic similarity distributions from the model simulation. The topic frequency distributions (a-b) are sorted by their topic ranks, normalized by the total number of topics (see Method). Thick dashed lines indicate the distributions used to calibrate the computational model, and thin dashed lines indicate the best-fitting lines for individual communities (see Supplementary Table 7 for best-fitting parameters). Error bars indicate $\pm1$ standard deviations across $10,000$ simulations of $120$ time steps (equivalent to $10$ years).
  • Figure 3: Qualitative comparison of data and model output. a, Examples of empirically observed trends in topic frequencies in article titles (tier 1), for four illustrative topics (Vaccine, Climate, Guns, Abortion) discussed in online news communities Mother Jones (MJ), Atlantic (AT), The Hill (TH), Breitbart (BB), and Gateway Pundit (GP; left panel). We highlight two external shocks that were related to high peaks in the title frequency of the Vaccine topic: the COVID-19 pandemic (b, left) and the US Ebola outbreak (c, left). b-c, Empirical differences between communities (left panels) can be reproduced in model simulation by tuning the filter strength $\lambda_f$ during the external shock (right panels). The external shock increases the target topic frequency in the general semantic network (insets in right panels). The error bars indicate $\pm1$ standard deviations across $10,000$ simulations. d-e, Selected representative examples of diverse qualitative trends of comment topic frequency (increasing, decreasing, oscillating in time, with single or multiple peaks) (d) and topic similarity (e), observed in the empirical data (left panels) and the corresponding model output (right panels).
  • Figure 4: Impact of editorial influences on the community semantic networks in the computational model. a-d, Alignment is represented as the strength of the community filter ($\lambda_f$, a). It slows down the movement of the comment network relative to the general semantic network, keeping it in its initial position. When the initial position of the community semantic network is the same as (far from) that of the general semantic network, alignment makes the comment network more (less) similar to the general semantic network (b-c) for all memory strengths ($\lambda_m$), as measured by Kendall-tau rank distance between the networks. The effect quickly disappears once the alignment is removed (d). e-h, Amplification is represented as a subjective increase ($s_{\text{amp}}$) in the frequency of the target topic in the general semantic network, as perceived by the editors (e). It increases the frequency of the target topic in the news (f) and the comments (g) for all filter strengths, and its effect remains even after it is removed. It also increases the similarity between the target topic and other topics, especially for the initially more similar topics (top vs. bottom $20\%$; Fig. 4h). i-l, Reframing is implemented by replacing one of the topics in the news that has passed the filter by a target topic (i), with probability $p_{\text{ref}}$. When applied to topics in tier 2 of news, it increases the frequency of that topic in tier 2 (j), but over time also in tier 1 (k), with both effects persisting after reframing is removed. It also increases the similarity between the target topic and other topics, especially for the initially less similar (l). The error bars indicate $\pm1$ standard deviations across $10,000$ simulations. The gray zone indicates the influence period. All ratios and differences are relative to the baselines without influences. Semi-transparent lines represent raw data and solid lines indicate denoised data. Exceptions are the similarity differences (h, l) where all lines indicate raw data.
  • Figure 5: Impact of community influences on the community semantic networks in the computational model. a-d, Membership turnover is implemented as a decrease in community memory strength ($\lambda_m$, a). It accelerates the movement of the comment network relative to the general semantic network. When the initial position of the community semantic network is the same as (far from) that of the general semantic network, turnover makes the comment network less (more) similar to the general semantic network (b-c) for all memory strengths ($\lambda_m$), as measured by Kendall-tau rank distance between the networks. Once the turnover stops, its effect persists (d). e-h, Trolls are implemented by increasing the frequency of comments discussing a target topic unrelated to the news (e, $s_{\text{tr}}$). They increase the frequency of the target topic in the comment network (f) and in the news (g) for all memory and filter strengths. This effect persists for a long time even after the trolls are removed, but the t-SNE plot of the comment topic profile reveals that eventually, the comment network returns to its original position (h). i-l, Counterspeech is implemented as increasing the frequency of comments related to the news (i, $s_{\text{cs}}$). It decreases the frequency of the target topic promoted by trolls, but it needs to be much stronger than the troll influence to remove their effect completely (j). The sooner the counterspeech is introduced, the more effective it is against trolls (k). Unlike the removal of trolls, this does not return the comment network to its original position (l). The error bars indicate $\pm1$ standard deviations across $10,000$ simulations. The gray zone indicates the influence period. All ratios and differences are relative to the baselines without influences. Semi-transparent lines represent raw data and solid lines indicate denoised data. For t-SNE plots (h, l), the raw time series of t-SNE coordinates (averaged over $1,000$ simulations) are represented by semi-transparent markers while the smoothed time series (by averaging over $25$ time step interval) are plotted with larger markers connected by arrows.
  • ...and 16 more figures
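
The network construction described in the Figure 1b caption (topic counts $f_i$, mean embeddings $e_i$, and cosine-similarity weights $w_{ij}$) can be sketched roughly as below. This is a minimal illustration, not the authors' code: the function name `build_topic_network`, the array layout, and the handling of topics with no comments in the interval (zero vector, zero similarity) are assumptions made here. The comment embeddings and per-comment topic assignments are assumed to come from an upstream BERT/BERTopic step that is not shown.

```python
import numpy as np


def build_topic_network(embeddings, topic_ids, n_topics):
    """Build (f, e, w) for the comments of one time interval.

    embeddings : (n_comments, dim) array of comment embeddings
    topic_ids  : (n_comments,) integer topic label of each comment
    n_topics   : total number of topics
    """
    embeddings = np.asarray(embeddings, dtype=float)
    topic_ids = np.asarray(topic_ids, dtype=int)

    # f_i: number of comments discussing topic i
    f = np.bincount(topic_ids, minlength=n_topics)

    # e_i: mean embedding of the comments assigned to topic i
    e = np.zeros((n_topics, embeddings.shape[1]))
    np.add.at(e, topic_ids, embeddings)  # sum embeddings per topic
    present = f > 0
    e[present] /= f[present][:, None]

    # w_ij: cosine similarity between topic representations.
    # Assumption: topics without comments keep a zero vector, so their
    # similarity to every topic (including themselves) is 0.
    norms = np.linalg.norm(e, axis=1)
    unit = e / np.where(norms > 0, norms, 1.0)[:, None]
    w = unit @ unit.T
    return f, e, w
```

Calling `build_topic_network` once per time interval would yield a sequence of `(f, e, w)` triples, which is one plausible way to represent the dynamic semantic network that the captions refer to.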