PoSSUM: A Protocol for Surveying Social-media Users with Multimodal LLMs

Roberto Cerina

TL;DR

PoSSUM is an end-to-end protocol for unobtrusively polling social-media users with multimodal LLMs, constructing silicon samples from users' real-time digital traces. It couples modular prompting and filters with Multilevel Regression and Post-Stratification (MrP) using structured priors to correct for platform-driven selection bias and to produce population- and subpopulation-level estimates. Validated on the 2024 US Presidential election, PoSSUM achieved high state-level predictive accuracy and demonstrated novel learning, that is, it captured information absent from the LLM's training data, while revealing limitations in third-party coverage and time-sensitivity due to benchmark noise. The approach offers a fully automated, scalable alternative to traditional surveys, contingent on careful bias mitigation and cross-platform data integration to ensure robust, timely public-opinion insights.

Abstract

This paper introduces PoSSUM, an open-source protocol for unobtrusive polling of social-media users via multimodal Large Language Models (LLMs). PoSSUM leverages users' real-time posts, images, and other digital traces to create silicon samples that capture information not present in the LLM's training data. To obtain representative estimates, PoSSUM employs Multilevel Regression and Post-Stratification (MrP) with structured priors to counteract the observable selection biases of social-media platforms. The protocol is validated during the 2024 U.S. Presidential Election, for which five PoSSUM polls were conducted and published on GitHub and X. In the final poll, fielded October 17-26 with a synthetic sample of 1,054 X users, PoSSUM accurately predicted the outcomes in 50 of 51 states and assigned the Republican candidate a win probability of 0.65. Notably, it also exhibited lower state-level bias than most established pollsters. These results demonstrate PoSSUM's potential as a fully automated, unobtrusive alternative to traditional survey methods.
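To make the post-stratification half of MrP concrete, the following is a minimal sketch and not PoSSUM's actual implementation. It assumes the multilevel regression step has already produced an estimated vote share for each demographic cell, and it reweights those estimates by the cell's population count to obtain state-level figures. The stratification variables (state and age group), the function name `poststratify`, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# A cell is identified by (state, age_group). These stratification
# variables are hypothetical and chosen only for illustration.
Cell = Tuple[str, str]


def poststratify(
    cell_estimates: Dict[Cell, float],
    cell_counts: Dict[Cell, int],
) -> Dict[str, float]:
    """Return the population-weighted mean of cell estimates per state."""
    totals: Dict[str, int] = {}
    weighted: Dict[str, float] = {}
    for cell, count in cell_counts.items():
        state = cell[0]
        totals[state] = totals.get(state, 0) + count
        weighted[state] = weighted.get(state, 0.0) + count * cell_estimates[cell]
    return {state: weighted[state] / totals[state] for state in totals}


# Toy inputs (not real data): modelled support for one candidate in each
# cell, and the number of people in that cell in the target population.
cell_estimates = {
    ("A", "18-44"): 0.40,
    ("A", "45+"): 0.60,
    ("B", "18-44"): 0.50,
    ("B", "45+"): 0.70,
}
cell_counts = {
    ("A", "18-44"): 300,
    ("A", "45+"): 700,
    ("B", "18-44"): 500,
    ("B", "45+"): 500,
}

state_estimates = poststratify(cell_estimates, cell_counts)
for state, share in sorted(state_estimates.items()):
    print(f"{state}: {share:.2f}")
```

Because the weights come from the target population rather than from the sample, a cell that is over-represented among social-media users contributes no more to the state estimate than its population share warrants.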
