Whom to Query for What: Adaptive Group Elicitation via Multi-Turn LLM Interactions
Ruomeng Ding, Tianwei Gao, Thomas P. Zollo, Eitan Bachmat, Richard Zemel, Zhun Deng
TL;DR
This paper tackles the problem of inferring latent population-level properties under costly, partial data collection by introducing adaptive group elicitation. It jointly optimizes which questions to ask and which respondents to query in multi-turn interactions, guided by an LLM-based expected information gain objective and enhanced by a heterogeneous GNN that propagates relational information and imputes missing responses. Theoretical results establish near-optimality of greedy strategies and connect predictive inference to a generalized de Finetti framework, while experiments on three real-world datasets show substantial gains in population-level accuracy and calibration under constrained budgets, with pronounced benefits from targeting highly sensitive respondents. The approach demonstrates robust gains across model scales, regional splits, and budget levels, underscoring its practicality for efficient, scalable survey and collective assessment design.
Abstract
Eliciting information to reduce uncertainty about latent group-level properties from surveys and other collective assessments requires allocating limited questioning effort under real costs and missing data. Although large language models enable adaptive, multi-turn interactions in natural language, most existing elicitation methods optimize what to ask for a fixed respondent pool; they neither adapt respondent selection nor leverage population structure when responses are incomplete. To address this gap, we study adaptive group elicitation, a multi-round setting where an agent adaptively selects both questions and respondents under explicit query and participation budgets. We propose a theoretically grounded framework that combines (i) an LLM-based expected information gain objective for scoring candidate questions with (ii) heterogeneous graph neural network propagation that aggregates observed responses and participant attributes to impute missing responses and guide per-round respondent selection. This closed-loop procedure queries a small, informative subset of individuals while inferring population-level responses via structured similarity. Across three real-world opinion datasets, our method consistently improves population-level response prediction under constrained budgets, including a relative gain of more than 12% on CES at a 10% respondent budget.
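
To make the expected information gain (EIG) idea concrete, the following is a minimal sketch of greedy question scoring, not the paper's implementation. It assumes a discrete latent variable with an explicit prior and hand-written likelihood tables, whereas the paper estimates these quantities with an LLM and imputes missing responses with a heterogeneous GNN. The function names (`entropy`, `expected_information_gain`, `pick_question`) and the toy candidate questions are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def expected_information_gain(prior, likelihood):
    """EIG of one question about a discrete latent variable.

    prior[t]         = p(theta = t)
    likelihood[t][y] = p(answer = y | theta = t) for this question
    Returns H[prior] - E_y[H[posterior given y]], i.e. the mutual
    information between the latent variable and the answer.
    """
    n_states = len(prior)
    n_answers = len(likelihood[0])
    eig = entropy(prior)
    for y in range(n_answers):
        # Marginal probability of observing answer y.
        p_y = sum(prior[t] * likelihood[t][y] for t in range(n_states))
        if p_y == 0:
            continue
        # Bayes update of the latent variable given answer y.
        posterior = [prior[t] * likelihood[t][y] / p_y for t in range(n_states)]
        eig -= p_y * entropy(posterior)
    return eig


def pick_question(prior, candidates):
    """Greedy step: return the candidate name with the highest EIG."""
    return max(
        candidates,
        key=lambda name: expected_information_gain(prior, candidates[name]),
    )


# Toy setup (hypothetical): two latent states, two candidate questions.
prior = [0.5, 0.5]
candidates = {
    "informative": [[0.9, 0.1], [0.1, 0.9]],    # answers depend on the state
    "uninformative": [[0.5, 0.5], [0.5, 0.5]],  # answers ignore the state
}
best = pick_question(prior, candidates)
```

In this toy setup the question whose answers depend on the latent state receives a strictly positive score, while the one whose answers are independent of it scores zero, so the greedy step selects the former. Extending the sketch to respondent selection would require scoring (question, respondent) pairs, which the abstract describes but this example does not attempt.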
