Embedding Democratic Values into Social Media AIs via Societal Objective Functions

Chenyan Jia, Michelle S. Lam, Minh Chau Mai, Jeff Hancock, Michael S. Bernstein

TL;DR

A method for translating established, vetted social scientific constructs into AI objective functions, termed societal objective functions, is introduced and demonstrated with application to the political science construct of anti-democratic attitudes.

Abstract

Can we design artificial intelligence (AI) systems that rank our social media feeds to consider democratic values such as mitigating partisan animosity as part of their objective functions? We introduce a method for translating established, vetted social scientific constructs into AI objective functions, which we term societal objective functions, and demonstrate the method with application to the political science construct of anti-democratic attitudes. Traditionally, we have lacked observable outcomes to use to train such models; however, the social sciences have developed survey instruments and qualitative codebooks for these constructs, and their precision facilitates translation into detailed prompts for large language models. We apply this method to create a democratic attitude model that estimates the extent to which a social media post promotes anti-democratic attitudes, and test this democratic attitude model across three studies. In Study 1, we first test the attitudinal and behavioral effectiveness of the intervention among US partisans (N=1,380) by manually annotating (alpha=.895) social media posts with anti-democratic attitude scores and testing several feed ranking conditions based on these scores. Removal (d=.20) and downranking feeds (d=.25) reduced participants' partisan animosity without compromising their experience and engagement. In Study 2, we scale up the manual labels by creating the democratic attitude model, finding strong agreement with manual labels (rho=.75). Finally, in Study 3, we replicate Study 1 using the democratic attitude model instead of manual labels to test its attitudinal and behavioral impact (N=558), and again find that feed downranking using the societal objective function reduced partisan animosity (d=.25). This method presents a novel strategy to draw on social science theory and methods to mitigate societal harms in social media AIs.
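
The agreement check described for Study 2 (model scores compared against manual labels, reported as rho) can be illustrated with a small rank-correlation sketch. This is a minimal illustration and not the authors' analysis code: it assumes rho denotes Spearman's rank correlation, and the function names and toy scores below are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy, made-up scores for illustration only (not data from the paper).
manual_scores = [1, 2, 2, 3]
model_scores = [1, 2, 3, 4]
rho = spearman_rho(manual_scores, model_scores)
```

In practice a library routine such as `scipy.stats.spearmanr` would typically replace the hand-written functions; the pure-Python version is shown only to make the computation explicit.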

Paper Structure (59 sections, 7 figures, 7 tables)

Figures (7)

  • Figure 1: Steps of our societal objective function method: (1) Identify a well-established social scientific construct, such as the construct of anti-democratic attitudes; (2) Operationalize the construct with manual rating methods, as shown with our manual democratic attitude feed in Study 1; (3) Scale up the ratings with algorithmic methods using an LLM, as shown with our algorithmic democratic attitude feed in Studies 2 & 3.
  • Figure 2: Summary of our seven feed ranking conditions. The democratic attitude feeds incorporate our anti-democratic attitude model with either: (1) Downranking, (2) Content Warning, or (3) Remove-and-Replace feeds. The comparison feeds capture a range of existing feed ranking methods including (4) Engagement-Based, (5) Ideologically Balanced, or (6) Chronological feeds as well as (7) a Null Control where no feed is shown.
  • Figure 3: Website Interface of Democratic Attitude Feeds. Participants in different feed conditions were exposed to different interfaces. (Left) Example posts towards the top of the Downranking condition where anti-democratic information is ranked towards the bottom of the feed. (Center) Example posts in the Remove-and-Replace feed in which anti-democratic posts are replaced with pro-democratic posts. (Right) Example posts in the Content Warning feed where anti-democratic posts are blurred with content warnings and users must click the post to see the information.
  • Figure 4: Means of Partisan Animosity Across Conditions (Divided by Parties). We found that democratic attitude feeds (in purple), specifically the downranking and remove-and-replace feeds, caused significantly less partisan animosity than the engagement feed (in dark grey).
  • Figure 5: Means of Feed-level Satisfaction Across Conditions (Divided by Parties). We found that Democrats exposed to democratic attitude feeds (in purple), specifically the downranking and removal feeds, had significantly higher feed-level satisfaction than those exposed to the engagement-based feed. Republicans exposed to the removal feed had significantly higher satisfaction than those in the engagement-based feed.
  • ...and 2 more figures
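
The three democratic attitude feed conditions named in the Figure 2 and Figure 3 captions (Downranking, Content Warning, Remove-and-Replace) can be sketched as simple transformations of a scored feed. This is a hedged illustration only: the `Post` type, the 0-to-1 score scale, the `THRESHOLD` value, and the caller-supplied list of replacement posts are all assumptions and do not come from the paper.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Post:
    post_id: str
    score: float          # assumed scale: higher = more anti-democratic
    warned: bool = False  # True if shown behind a content warning


THRESHOLD = 0.5  # hypothetical cutoff for treating a post as anti-democratic


def downrank(feed, threshold=THRESHOLD):
    """Move flagged posts to the bottom, preserving order within each group."""
    kept = [p for p in feed if p.score < threshold]
    flagged = [p for p in feed if p.score >= threshold]
    return kept + flagged


def content_warning(feed, threshold=THRESHOLD):
    """Keep feed order, but mark flagged posts as hidden behind a warning."""
    return [replace(p, warned=True) if p.score >= threshold else p for p in feed]


def remove_and_replace(feed, replacements, threshold=THRESHOLD):
    """Swap each flagged post for the next replacement post, if one is left."""
    pool = iter(replacements)
    result = []
    for p in feed:
        if p.score < threshold:
            result.append(p)
            continue
        substitute = next(pool, None)
        if substitute is not None:
            result.append(substitute)
    return result


# Toy feed with made-up scores, for illustration only.
feed = [Post("a", 0.9), Post("b", 0.1), Post("c", 0.7), Post("d", 0.2)]
downranked = downrank(feed)
warned = content_warning(feed)
replaced = remove_and_replace(feed, [Post("x", 0.0)])
```

A hard threshold is the simplest choice here; a production ranker could instead blend the score continuously with other ranking signals, which this sketch does not attempt.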