Point of Order: Action-Aware LLM Persona Modeling for Realistic Civic Simulation
Scott Merrill, Shashank Srivastava
TL;DR
The paper tackles the challenge of realistically simulating multi-party civic deliberations by generating speaker-attributed transcripts from public Zoom recordings and enriching them with action-aware metadata. It introduces a multimodal speaker-linking pipeline and a structured metadata framework (persona profiles, topics, action tags) used to fine-tune LLMs with parameter-efficient methods, yielding substantial improvements in perplexity ($PPL$) and in the $CFR$ and $SAA$ metrics. The authors release three datasets of speaker-labeled government deliberations and evaluate realism through both automatic metrics and Turing-style human evaluations, finding that simulations are often indistinguishable from real dialogues. Temporal grounding and carefully designed prompts further improve topic coverage and goal coherence in simulations, enabling scalable counterfactual analyses of institutional decision-making. Collectively, this work provides a reproducible pipeline and datasets that advance realistic, persona-aware civic simulations for policy analysis, training, and civic tech research.
Abstract
Large language models offer opportunities to simulate multi-party deliberation, but realistic modeling remains limited by a lack of speaker-attributed data. Transcripts produced via automatic speech recognition (ASR) assign anonymous speaker labels (e.g., Speaker_1), preventing models from capturing consistent human behavior. This work introduces a reproducible pipeline to transform public Zoom recordings into speaker-attributed transcripts with metadata such as persona profiles and pragmatic action tags (e.g., [propose_motion]). We release three local government deliberation datasets: Appellate Court hearings, School Board meetings, and Municipal Council sessions. Fine-tuning LLMs to model specific participants using this "action-aware" data produces a 67% reduction in perplexity and nearly doubles classifier-based performance metrics for speaker fidelity and realism. Turing-style human evaluations show our simulations are often indistinguishable from real deliberations, providing a practical and scalable method for complex, realistic civic simulations.
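To make the perplexity claim concrete, the following is a minimal sketch of how perplexity and a relative reduction are commonly computed from per-token log-probabilities. The function names and the numeric values are illustrative assumptions, not the authors' evaluation code or reported data; the sketch only shows what a reduction of roughly two thirds looks like under the standard definition.

```python
import math


def perplexity(token_logprobs):
    """Standard perplexity: exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def relative_reduction(baseline_ppl, tuned_ppl):
    """Fraction by which tuned_ppl is lower than baseline_ppl."""
    return (baseline_ppl - tuned_ppl) / baseline_ppl


# Hypothetical log-probabilities (made up for illustration, not from the paper):
# the baseline model gives each reference token probability 0.1,
# the fine-tuned model gives each reference token probability 0.3.
baseline = perplexity([math.log(0.1)] * 4)
tuned = perplexity([math.log(0.3)] * 4)
reduction = relative_reduction(baseline, tuned)

print(f"baseline PPL={baseline:.2f}, tuned PPL={tuned:.2f}, reduction={reduction:.1%}")
```

With these made-up inputs, tripling the average probability assigned to each reference token cuts perplexity by about two thirds, which is the scale of improvement the abstract reports.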
