Point of Order: Action-Aware LLM Persona Modeling for Realistic Civic Simulation

Scott Merrill, Shashank Srivastava

TL;DR

The paper tackles the challenge of realistically simulating multi-party civic deliberations by generating speaker-attributed transcripts from public Zoom recordings and enriching them with action-aware metadata. It introduces a multimodal speaker-linking pipeline and a structured metadata framework (persona profiles, topics, action tags) used to fine-tune LLMs with parameter-efficient methods, yielding substantial improvements in perplexity (PPL), classifier fool rate (CFR), and speaker attribution accuracy (SAA). The authors release three datasets of speaker-labeled local government deliberations and evaluate realism through both automatic metrics and Turing-style human evaluations, finding that simulations are often indistinguishable from real dialogues. Temporal grounding and carefully designed prompts further improve topic coverage and goal coherence in simulations, enabling scalable counterfactual analyses of institutional decision-making. Collectively, this work provides a reproducible pipeline and datasets that advance realistic, persona-aware civic simulations for policy analysis, training, and civic tech research.

Abstract

Large language models offer opportunities to simulate multi-party deliberation, but realistic modeling remains limited by a lack of speaker-attributed data. Transcripts produced via automatic speech recognition (ASR) assign anonymous speaker labels (e.g., Speaker_1), preventing models from capturing consistent human behavior. This work introduces a reproducible pipeline to transform public Zoom recordings into speaker-attributed transcripts with metadata like persona profiles and pragmatic action tags (e.g., [propose_motion]). We release three local government deliberation datasets: Appellate Court hearings, School Board meetings, and Municipal Council sessions. Fine-tuning LLMs to model specific participants using this "action-aware" data produces a 67% reduction in perplexity and nearly doubles classifier-based performance metrics for speaker fidelity and realism. Turing-style human evaluations show our simulations are often indistinguishable from real deliberations, providing a practical and scalable method for complex realistic civic simulations.
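The three automatic metrics named above can be illustrated with a minimal sketch. The paper's exact formulas are not reproduced on this page, so the definitions below are the commonly used ones and should be read as assumptions: perplexity as the exponential of the mean negative token log-likelihood, fool rate as the share of generated turns that a real-versus-generated classifier labels as real, and speaker attribution accuracy as the share of generated turns attributed to the intended speaker. All function names and example values are hypothetical.

```python
import math


def perplexity(token_logprobs):
    """Perplexity from natural-log token probabilities (assumed standard definition)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def relative_reduction(baseline, improved):
    """Fractional drop from a baseline value to an improved value."""
    return (baseline - improved) / baseline


def fool_rate(classifier_labels):
    """Share of generated turns that a classifier labels "real" (assumed definition)."""
    if not classifier_labels:
        raise ValueError("need at least one label")
    fooled = sum(1 for label in classifier_labels if label == "real")
    return fooled / len(classifier_labels)


def attribution_accuracy(predicted_speakers, intended_speakers):
    """Share of generated turns attributed to the speaker they were meant to imitate."""
    if len(predicted_speakers) != len(intended_speakers) or not intended_speakers:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == t for p, t in zip(predicted_speakers, intended_speakers))
    return hits / len(intended_speakers)
```

With these definitions, a 67% perplexity reduction corresponds to, for example, a drop from 30.0 to 9.9 (illustrative numbers, not results from the paper): `relative_reduction(30.0, 9.9)` evaluates to approximately 0.67.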

Paper Structure

This paper contains 62 sections, 1 equation, 18 figures, 8 tables.

Figures (18)

  • Figure 1: Public Zoom recordings are converted into speaker-attributed transcripts using a speaker-linking pipeline that aligns anonymous diarization labels with persistent real-world identities across videos. From these transcripts, we automatically extract structured metadata (topics, action tags, and speaker profiles), which conditions PEFT fine-tuned LLMs for persona modeling. We evaluate these LLM personas on perplexity, classifier fool rate, and speaker attribution accuracy.
  • Figure 2: Active speakers are identified by using the highlighted speaker tile provided by Zoom (1). From this detected speaker tile, the name-region is extracted (2) and processed with OCR to obtain the participant's identity.
  • Figure 3: Summary of model performance across system message configurations. The plots report perplexity (lower is better), fool rate (higher is better), and speaker attribution accuracy (higher is better), with values averaged over all datasets and all model families. Error bars denote the standard error of the mean.
  • Figure 4: (a) The Albemarle County School Board simulation follows typical meeting procedures, including a moment of silence, mandated statements, roll call, and participant introductions. (b) The DC Court of Appeals simulation follows a structured order with opening statements, questioning, and verification of representation.
  • Figure 5: Two-stage prompting strategy for persona extraction. Stage 1 uses a schema-based GPT-5 prompt to extract structured communicative traits from each speaker’s longest monologues. Stage 2 merges these traits into a coherent, natural-language persona summary that captures tone, boundaries, and priorities, enabling persona-aware modeling in deliberative settings.
  • ...and 13 more figures
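
The speaker-linking idea described in the captions for Figures 1 and 2 can be sketched as a simple vote: each anonymous diarization label is assigned the name that OCR most often reads from the highlighted Zoom tile while that label is speaking. This is only an illustration under stated assumptions. The paper's actual pipeline is multimodal and is not detailed on this page, and the function name, data layout, and participant names below are all hypothetical.

```python
from collections import Counter, defaultdict


def link_speakers(segments, ocr_hits):
    """Map anonymous diarization labels to names by majority vote.

    segments: list of (label, start_sec, end_sec) from diarization.
    ocr_hits: list of (time_sec, name) read from the highlighted speaker tile.
    Labels with no OCR observation during their segments are left unmapped.
    """
    votes = defaultdict(Counter)
    for label, start, end in segments:
        for time_sec, name in ocr_hits:
            # Count an OCR reading only if it falls inside this segment.
            if start <= time_sec < end:
                votes[label][name] += 1
    return {label: counts.most_common(1)[0][0] for label, counts in votes.items()}


# Hypothetical example data, not taken from the released datasets.
segments = [("Speaker_1", 0, 10), ("Speaker_2", 10, 20), ("Speaker_1", 20, 30)]
ocr_hits = [
    (1, "Chair Smith"),
    (5, "Chair Smith"),
    (12, "Member Lee"),
    (22, "Chair Smith"),
    (25, "Member Lee"),  # a noisy reading that the vote outweighs
]
mapping = link_speakers(segments, ocr_hits)
```

Voting across all of a label's segments is one way to tolerate occasional OCR misreads or tile-highlight lag; linking identities across separate videos, as Figure 1 describes, would need an additional step that this sketch does not attempt.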