AIDG: Evaluating Asymmetry Between Information Extraction and Containment in Multi-Turn Dialogue

Adib Sakhawat; Fardeen Sadab; Rakin Shahriar

TL;DR

AIDG (Adversarial Information Deduction Game), a game-theoretic framework that probes the asymmetry between information extraction and information containment in dialogue, suggests that while LLMs excel at local defensive coherence, they struggle with the global state tracking required for strategic inquiry.

Abstract

Evaluating the strategic reasoning capabilities of Large Language Models (LLMs) requires moving beyond static benchmarks to dynamic, multi-turn interactions. We introduce AIDG (Adversarial Information Deduction Game), a game-theoretic framework that probes the asymmetry between information extraction (active deduction) and information containment (state maintenance) in dialogue. We propose two complementary tasks: AIDG-I, measuring pragmatic strategy in social deduction, and AIDG-II, measuring constraint satisfaction in a structured "20 Questions" setting. Across 439 games with six frontier LLMs, we observe a clear capability asymmetry: models perform substantially better at containment than deduction, with a 350-point ELO advantage on defense (Cohen's d = 5.47). We identify two bottlenecks driving this gap: (1) Information Dynamics, where confirmation strategies are 7.75x more effective than blind deduction (p < 0.00001), and (2) Constraint Adherence, where instruction-following degrades under conversational load, accounting for 41.3% of deductive failures. These findings suggest that while LLMs excel at local defensive coherence, they struggle with the global state tracking required for strategic inquiry.

Paper Structure (89 sections, 16 equations, 1 figure, 13 tables)

Introduction
Related Work
Sequential Dependencies in Multi-Turn Dialogue.
Interactive Capability Evaluation.
Constraint Satisfaction Under Load.
Strategic Reasoning and Game Theory.
Pragmatic Leakage Detection.
Comparative Capability Modeling.
Methodology: The AIDG Framework
Design Principles
AIDG-I: Pragmatic State Maintenance
Information Corpus.
Strategic Modalities.
Leak Detection Logic.
AIDG-II: Constrained Deductive Search
...and 74 more sections

Figures (1)

Figure 1: Overview of the AIDG framework. The pipeline consists of four stages: (1) Initialization (model selection, role assignment, secret sampling), (2) Multi-turn interaction between Seeker and Holder, (3) Arbiter-based adjudication of leakage or correct lock, and (4) Outcome computation with Dual-ELO updates and efficiency weighting.
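
Stage (4) of the pipeline can be illustrated with a minimal sketch of a role-separated rating update. This section does not give the paper's exact Dual-ELO formula, so the code below assumes the standard Elo expected-score rule; the names `DualRating` and `update_dual_elo`, the starting rating of 1000, the K-factor of 32, and the scalar `weight` standing in for efficiency weighting are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DualRating:
    """Two ratings per model, one per role (assumed starting value: 1000)."""
    seeker: float = 1000.0  # rating when playing the extraction (Seeker) role
    holder: float = 1000.0  # rating when playing the containment (Holder) role


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_dual_elo(seeker_model: DualRating, holder_model: DualRating,
                    seeker_won: bool, k: float = 32.0,
                    weight: float = 1.0) -> float:
    """Update only the role-specific ratings used in this game.

    `weight` is a hypothetical multiplier standing in for efficiency
    weighting; how the paper computes it is not stated in this section.
    """
    exp = expected_score(seeker_model.seeker, holder_model.holder)
    score = 1.0 if seeker_won else 0.0
    delta = k * weight * (score - exp)
    seeker_model.seeker += delta   # the Seeker's offensive rating moves by delta
    holder_model.holder -= delta   # the Holder's defensive rating moves the opposite way
    return delta


model_a, model_b = DualRating(), DualRating()
delta = update_dual_elo(model_a, model_b, seeker_won=True)
```

Keeping a separate rating per role is what would let an extraction score and a containment score be compared for the same model, which is the kind of asymmetry the abstract reports.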
