Evidence of a log scaling law for political persuasion with large language models
Kobi Hackenburg, Ben M. Tappin, Paul Röttger, Scott Hale, Jonathan Bright, Helen Margetts
TL;DR
The paper investigates whether static political messages generated by large language models become more persuasive as model size increases. It analyzes 720 messages on 10 U.S. political issues produced by 24 language models, ranging from small open-weight models to two frontier closed models, and shown to 25,982 U.S. adults in a preregistered randomized survey experiment; model size is measured by active parameters. A random-effects meta-analysis supports a log scaling law: persuasiveness increases linearly with $\log(\text{parameter count})$, so that a one-unit increase in $\log(\text{parameters})$ raises the average treatment effect by $1.26$ percentage points, with an intercept of $5.77$ percentage points at average model size. Because returns diminish sharply, frontier models are only modestly more persuasive than models smaller by an order of magnitude or more. Importantly, once task completion (coherence and staying on topic) is adjusted for, model size no longer predicts persuasiveness, which suggests a ceiling on the gains from scaling for static messages. For risk assessment and policy, these findings suggest that the persuasiveness of static messages may be near a plateau and that further gains depend more on task quality than on sheer size, though multi-turn or fine-tuned approaches could still have greater impact.
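To make the diminishing returns concrete, the slope reported above can be turned into a small calculation. This is a minimal sketch, not the authors' analysis code: it assumes the logarithm is the natural log (the summary does not state the base), and the function name `ate_gain` is hypothetical. It uses only the difference between two model sizes, so it does not need the unstated average model size at which the intercept is defined.

```python
import math

# Slope from the summary: percentage points of average treatment effect (ATE)
# gained per one-unit increase in log(parameters).
SLOPE_PP = 1.26


def ate_gain(params_from: float, params_to: float, slope: float = SLOPE_PP) -> float:
    """Predicted change in ATE (percentage points) between two model sizes.

    Assumes ATE = a + slope * ln(params); the natural-log base is an
    assumption made for illustration, not something the summary states.
    """
    if params_from <= 0 or params_to <= 0:
        raise ValueError("parameter counts must be positive")
    return slope * (math.log(params_to) - math.log(params_from))


if __name__ == "__main__":
    # Each row is a tenfold increase in size; the predicted gain is the same
    # for every row, which is what "log scaling" means in practice.
    for small, large in [(1e9, 1e10), (1e10, 1e11), (1e11, 1e12)]:
        print(f"{small:.0e} -> {large:.0e}: +{ate_gain(small, large):.2f} pp")
```

Under these assumptions, every tenfold increase in parameter count adds the same amount, $1.26 \times \ln 10$, or about 2.9 percentage points, whether the starting model is small or large. Each equal step in persuasiveness therefore costs ten times more parameters than the previous one.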
Abstract
Large language models can now generate political messages as persuasive as those written by humans, raising concerns about how far this persuasiveness may continue to increase with model size. Here, we generate 720 persuasive messages on 10 U.S. political issues from 24 language models spanning several orders of magnitude in size. We then deploy these messages in a large-scale randomized survey experiment (N = 25,982) to estimate the persuasive capability of each model. Our findings are twofold. First, we find evidence of a log scaling law: model persuasiveness is characterized by sharply diminishing returns, such that current frontier models are barely more persuasive than models smaller in size by an order of magnitude or more. Second, mere task completion (coherence, staying on topic) appears to account for larger models' persuasive advantage. These findings suggest that further scaling model size will not much increase the persuasiveness of static LLM-generated messages.
