
United States Politicians' Tone Became More Negative with 2016 Primary Campaigns

Jonathan Külz, Andreas Spitz, Ahmad Abu-Akel, Stephan Günnemann, Robert West

Abstract

There is a widespread belief that the tone of US political language has become more negative recently, in particular when Donald Trump entered politics. At the same time, there is disagreement as to whether Trump changed or merely continued previous trends. To date, data-driven evidence regarding these questions is scarce, partly due to the difficulty of obtaining a comprehensive, longitudinal record of politicians' utterances. Here we apply psycholinguistic tools to a novel, comprehensive corpus of 24 million quotes from online news attributed to 18,627 US politicians in order to analyze how the tone of US politicians' language evolved between 2008 and 2020. We show that, whereas the frequency of negative emotion words had decreased continuously during Obama's tenure, it suddenly and lastingly increased with the 2016 primary campaigns, by 1.6 pre-campaign standard deviations, or 8% of the pre-campaign mean, in a pattern that emerges across parties. The effect size drops by 40% when omitting Trump's quotes, and by 50% when averaging over speakers rather than quotes, implying that prominent speakers, and Trump in particular, have disproportionately, though not exclusively, contributed to the rise in negative language. This work provides the first large-scale data-driven evidence of a drastic shift toward a more negative political tone, with the start of Trump's campaign as a catalyst, and has important implications for the debate about the state of US politics.

Paper Structure (3 sections, 4 equations, 7 figures, 1 table)

Figures (7)

  • Figure 1: Quantifying the evolution of negative language in US politics (2008--2020). (a) The black points show the fraction of negative emotion words, averaged monthly over all quotes from all 18,627 quoted politicians. The red vs. blue background shows the quote share of Trump vs. Obama (if Trump had $T$ quotes and Obama had $O$ quotes in a given month, the respective red bar covers a fraction $T/(T+O)$ of the full $y$-range). Whereas the frequency of negative emotion words had decreased continuously during the first 6.5 years of Obama's tenure, it suddenly and lastingly increased in June 2015, when Trump's primary campaign started and his quote share began to surpass Obama's. (b) Regression analysis: The black points again show the fraction of negative emotion words, but now as $z$-scores (i.e., after subtracting the pre-campaign mean and dividing by the pre-campaign standard deviation). In red, we plot regression lines for the periods before and after June 2015. The coefficients of the ordinary least squares regression model $y_t = \alpha_0 + \beta_0 \,t + \alpha \,i_{t} + \beta \,i_{t} \,t + \varepsilon_{t}$ (where $t$ is the number of months since June 2015, and $i_t$ indicates whether $t \geq 0$; cf. Eq. \ref{eqn:regression}) quantify the slopes of both lines, as well as the sudden increase of $\alpha=1.6$ pre-campaign standard deviations coinciding with the discontinuity in June 2015 ($t=0$), as visualized.
  • Figure 2: Temporal evolution of negative language. Columns correspond to negative-language word categories from LIWC; rows correspond to aggregation methods for computing monthly averages. Points show monthly averages, expressed as pre-campaign $z$-scores (i.e., subtracting the pre-campaign mean from raw frequency values, and dividing by the pre-campaign standard deviation). Lines (with 95% confidence intervals) were obtained via ordinary least squares regression, with coefficients shown in legends (cf. Eq. \ref{eqn:regression} and Fig. 1(b) for interpretation of coefficients; tabular summary in Supplementary Tables S4, S6, S8, S9). (a--e) Quote-level aggregation micro-averages over all quotes per month, i.e., speakers have weight proportional to their number of quotes in the respective month. Panel (a) shows the same data as Fig. 1. (f--j) Speaker-level aggregation macro-averages by speaker, i.e., all speakers with at least one quote in a given month have equal weight in that month. (k--o) Quote-level aggregation by party performs the analysis of (a--e), but separately for quotes from Democrats vs. Republicans (coefficients omitted for clarity; cf. Supplementary Tables S8 and S9). Significance of regression coefficients: *** $p<0.001$, ** $p<0.01$, * $p<0.05$. We observe drastic shifts toward a more negative tone at the modeled June 2015 discontinuity (Trump's campaign start).
  • Figure 3: Role of speaker prominence. The set of 18,627 US politicians was split into quartiles with respect to their total number of quotes (i.e., prominence); each panel shows the time series for negative emotion words obtained by performing monthly speaker-level aggregation on the respective quartile separately. That is, the figure shows the data of Fig. 2(f) after stratifying speakers by prominence. Lines (with 95% confidence intervals) were obtained via ordinary least squares regression, with coefficients in legends (cf. Eq. \ref{eqn:regression} and Fig. 1(b) for interpretation of coefficients; tabular summary of regression coefficients in Supplementary Tables S11, S12, S13, and S14). Significance of regression coefficients: *** $p<0.001$, ** $p<0.01$, * $p<0.05$. We observe that the abrupt increase in negative emotion words emerges in all strata of speaker prominence except the least prominent stratum, and that quotes by more prominent speakers overall contain more negative emotion words. The figure focuses on one category of negative-language words (negative emotion); for the other four categories, see Supplementary Fig. S3.
  • Figure 4: Biographic correlates of negative language. Coefficients (with 95% confidence intervals) of ordinary least squares regression (Eq. \ref{eqn:extended_regression}) for modeling time series of word categories (quote-level aggregation; speaker-level aggregation in Supplementary Fig. S6) while adjusting for party affiliation ($\gamma$), the party's federal role ($\delta$), Congress membership ($\zeta$), and gender ($\eta$) (tabular summary in Supplementary Tables S15 and S16). Positive coefficients mark word categories that are, ceteris paribus, used more frequently by Democrats than by Republicans, by members of the governing than by members of the opposition party, by Congress members than by others, or by females than by males (and vice versa for negative coefficients). We observe that quotes by members of the opposition party, Congress members, and Democrats contain significantly more negative language. Importantly, the sudden June 2015 jump in negative language ($\alpha$) remains significant after adjusting for biographic attributes.
  • Figure 5: Role of individual politicians: single-speaker study. Results of ordinary least squares regressions (Eq. \ref{eqn:regression}) fitted separately to the time series of each of the 200 most quoted speakers. (a--e) Each speaker's $\alpha$ coefficient (capturing the size of the June 2015 jump, with 95% confidence intervals) as a function of the speaker's number of quotes. Significant coefficients ($p<0.05$) in color, others in gray. (f--j) Fraction of speakers with positive $\alpha$ among the speakers with at least $n$ quotes, as a function of $n$. We observe that, although many individual $\alpha$ coefficients are non-significant (a--e), the majority of coefficients are positive, particularly among the most-quoted speakers (as manifested in the increasing curves of (f--j)). That is, the June 2015 jump in negative language emerges even at the individual level for a majority of the most-quoted politicians.
  • ...and 2 more figures
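
The captions of Figures 1 and 2 describe three steps: monthly aggregation (quote-level vs. speaker-level), pre-campaign $z$-scoring, and an ordinary least squares fit with a discontinuity at $t=0$. The following Python sketch illustrates these steps under stated assumptions; it is not the authors' code. All function names are hypothetical, each quote is assumed to arrive as a `(t, speaker, score)` tuple with a precomputed per-quote word-category frequency, and the use of the sample standard deviation is an assumption. Because the model $y_t = \alpha_0 + \beta_0 t + \alpha i_t + \beta i_t t + \varepsilon_t$ is fully interacted, its OLS point estimates coincide with fitting one line before and one line after $t=0$, which is what the sketch does (confidence intervals and $p$-values are not computed here).

```python
from collections import defaultdict
from statistics import mean, stdev


def monthly_average(quotes, level="quote"):
    """Average per-quote scores by month.

    quotes: iterable of (t, speaker, score), t = months since June 2015.
    level="quote":   micro-average, every quote counts once.
    level="speaker": macro-average, every speaker active in a month counts once.
    """
    by_month = defaultdict(lambda: defaultdict(list))
    for t, speaker, score in quotes:
        by_month[t][speaker].append(score)
    result = {}
    for t, by_speaker in by_month.items():
        if level == "quote":
            result[t] = mean(s for scores in by_speaker.values() for s in scores)
        else:
            result[t] = mean(mean(scores) for scores in by_speaker.values())
    return result


def pre_campaign_zscores(series):
    """Standardize a {t: value} series with the mean and std of months t < 0."""
    pre = [y for t, y in series.items() if t < 0]
    mu, sigma = mean(pre), stdev(pre)  # sample std: an assumption
    return {t: (y - mu) / sigma for t, y in series.items()}


def fit_line(points):
    """Least-squares line through (t, y) pairs; returns (intercept, slope)."""
    ts = [t for t, _ in points]
    ys = [y for _, y in points]
    t_bar, y_bar = mean(ts), mean(ys)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in points)
    sxx = sum((t - t_bar) ** 2 for t in ts)
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


def fit_discontinuity(series):
    """Point estimates of alpha_0, beta_0, alpha, beta for a {t: y} series."""
    pre = [(t, y) for t, y in series.items() if t < 0]
    post = [(t, y) for t, y in series.items() if t >= 0]
    alpha_0, beta_0 = fit_line(pre)
    post_intercept, post_slope = fit_line(post)
    return {
        "alpha_0": alpha_0,                 # pre-period intercept at t = 0
        "beta_0": beta_0,                   # pre-period slope
        "alpha": post_intercept - alpha_0,  # jump at t = 0
        "beta": post_slope - beta_0,        # change in slope after t = 0
    }
```

A typical call chain would be `fit_discontinuity(pre_campaign_zscores(monthly_average(quotes, level="speaker")))`. The difference between the two aggregation levels shows up whenever a few speakers dominate a month: with three quotes scored 1.0 from one speaker and one quote scored 0.0 from another, the quote-level average is 0.75 while the speaker-level average is 0.5.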