Question the Questions: Auditing Representation in Online Deliberative Processes

Soham De, Lodewijk Gelauff, Ashish Goel, Smitha Milli, Ariel Procaccia, Alice Siu

TL;DR

This work introduces an auditing framework for representation in online deliberation. It is grounded in justified representation (JR) and its quantitative $\alpha$-JR variant, and it assesses how well a small slate of questions represents a larger participant pool. Utilities for arbitrary questions are inferred via embedding-based cosine similarity, which allows both participant-proposed (extractive) and AI-generated (abstractive) slates to be evaluated. The authors present two auditing algorithms, including a scalable single-pass method with runtime $O(mn\log n)$, and apply them to historical deliberations to compare moderator-selected slates, extractive slates optimized by an integer program, and abstractive slates generated by LLMs. Empirically, algorithmic slates often improve representativeness over the historical moderator selections, and abstractive generation offers additional benefits in many contexts; the results also highlight limitations and the need for auditing in AI-assisted deliberation. The framework is integrated into an online deliberation platform so that practitioners can audit and improve representation in future deliberations.
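To make the embedding-based utilities concrete, here is a minimal sketch of how a utility matrix could be built from cosine similarity. It assumes each participant is represented by the embedding of the question they proposed; the function names and the toy 3-dimensional vectors are hypothetical and not taken from the paper, which uses real embedding models.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch, not from the paper
    return dot / (norm_a * norm_b)

def utility_matrix(participant_embeddings, question_embeddings):
    """utilities[i][j]: similarity of participant i's question to candidate j."""
    return [
        [cosine_similarity(p, q) for q in question_embeddings]
        for p in participant_embeddings
    ]

# Toy embeddings, made up purely for illustration.
participants = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
candidates = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
utilities = utility_matrix(participants, candidates)
```

Because the candidate questions only need an embedding, the same computation applies whether a candidate was proposed by a participant or generated by an LLM.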

Abstract

A central feature of many deliberative processes, such as citizens' assemblies and deliberative polls, is the opportunity for participants to engage directly with experts. While participants are typically invited to propose questions for expert panels, only a limited number can be selected due to time constraints. This raises the challenge of how to choose a small set of questions that best represents the interests of all participants. We introduce an auditing framework for measuring the level of representation provided by a slate of questions, based on the social choice concept known as justified representation (JR). We present the first algorithms for auditing JR in the general utility setting, with our most efficient algorithm achieving a runtime of $O(mn\log n)$, where $n$ is the number of participants and $m$ is the number of proposed questions. We apply our auditing methods to historical deliberations, comparing the representativeness of (a) the actual questions posed to the expert panel (chosen by a moderator), (b) participants' questions chosen via integer linear programming, and (c) summary questions generated by large language models (LLMs). Our results highlight both the promise and current limitations of LLMs in supporting deliberative processes. By integrating our methods into an online deliberation platform that has been used in hundreds of deliberations across more than 50 countries, we make it easy for practitioners to audit and improve representation in future deliberations.

Paper Structure

This paper contains 18 sections, 2 equations, 3 figures, 2 tables, 2 algorithms.

Figures (3)

  • Figure 1: ROC curves comparing the binary classification accuracy of different embedding models on the Quora Question Pairs (QQP) dataset [quora_question_pairs]. Each curve is obtained by thresholding the cosine similarity between the embeddings of paired questions.
  • Figure 2: Cross-validating audit outcomes across embedding models. Each heatmap shows the JR-value when slates optimized using an integer program on one embedding model are evaluated using another model. Rows indicate the model used for evaluation, and columns indicate the model used for optimization (via the IP). Lower JR-values indicate greater consistency between optimization and evaluation models. Any value below 1 implies that the slate satisfies JR.
  • Figure 3: Screenshots illustrating our approach implemented in the online deliberation platform. The moderator can generate LLM summary questions (referred to as "SuperQuestions" in the interface) from participant-proposed questions; view which participant-proposed questions are most similar to each LLM-generated one; and export all data, including similarity scores, for representation auditing.
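The thresholding procedure described for Figure 1 can be sketched as follows. This is an illustrative sketch only: the scores and labels are made-up stand-ins for cosine similarities and duplicate/non-duplicate labels, the function names are hypothetical, and the code assumes both classes appear at least once.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs.

    One point per distinct threshold; a pair of questions is predicted
    "duplicate" when its similarity score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve via the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Made-up similarity scores and labels (1 = duplicate pair, 0 = not).
scores = [0.95, 0.80, 0.60, 0.30]
labels = [1, 0, 1, 0]
curve = roc_points(scores, labels)
area = auc(curve)
```

In this toy example three of the four (duplicate, non-duplicate) pairs are ranked correctly, so the area under the curve is 0.75.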

Theorems & Definitions (2)

  • Definition 1: JR with utilities
  • Definition 2: $\alpha$-JR with utilities
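The definitions themselves are not reproduced on this page, so the following is only a simplified sketch of what a JR-style audit with utilities might look like, and it may differ from the paper's Definitions 1 and 2. It assumes a slate of $k$ questions is challenged when at least $n/k$ participants all strictly prefer some single candidate question to every question on the slate. The function name `deviation_ratio` and the toy utilities are hypothetical, and this brute-force loop is not the paper's $O(mn\log n)$ algorithm.

```python
def deviation_ratio(utilities, slate):
    """Largest deviating group size, relative to the n/k threshold.

    utilities[i][q] is participant i's utility for candidate question q;
    slate is a list of candidate indices. Under the assumed reading, a
    ratio below 1 means no group of size >= n/k prefers a single outside
    question to the whole slate.
    """
    n = len(utilities)
    m = len(utilities[0])
    k = len(slate)
    # Each participant's utility for their favourite question on the slate.
    best_in_slate = [max(row[j] for j in slate) for row in utilities]
    worst = 0
    for q in range(m):
        # Participants who strictly prefer q to everything on the slate.
        count = sum(1 for i in range(n) if utilities[i][q] > best_in_slate[i])
        worst = max(worst, count)
    return worst / (n / k)

# Toy utilities for 4 participants and 3 candidate questions (made up).
utilities = [
    [0.9, 0.1, 0.2],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.3],
    [0.2, 0.3, 0.9],
]
ratio_poor = deviation_ratio(utilities, [1, 2])  # ignores question 0
ratio_good = deviation_ratio(utilities, [0, 1])  # covers the largest group
```

Here the slate `[1, 2]` leaves two of four participants preferring question 0, which reaches the $n/k = 2$ threshold, while the slate `[0, 1]` leaves only one participant preferring an outside question.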