Exploring and Controlling Diversity in LLM-Agent Conversation

KuanChao Chu; Yi-Pei Chen; Hideki Nakayama

TL;DR

This paper addresses the control of dialogue diversity in LLM-based multi-agent simulations by introducing Adaptive Prompt Pruning (APP), a removal-based method that modulates prompt content through a single parameter $\lambda$ using attention-derived unit scores. It shows that prompt components, especially Memory, strongly constrain diversity and that high-attention content tends to suppress variation. APP can be combined with existing diversity techniques and with a post-generation revision step to balance diversity with fidelity. Experiments on the GA and HA datasets with LLaMA-based backbones show that APP increases diversity across settings, while the revision step reduces inconsistencies introduced by pruning. The work also examines how prompt structure, block order, and parametric knowledge interact to shape diversity, offering practical guidance for building flexible, robust multi-agent simulations.

Abstract

Controlling diversity in LLM-agent simulations is essential for balancing stability in structured tasks with variability in open-ended interactions. However, we observe that dialogue diversity tends to degrade over long-term simulations. To explore the role of prompt design in this phenomenon, we modularized the utterance generation prompt and found that reducing contextual information leads to more diverse outputs. Based on this insight, we propose Adaptive Prompt Pruning (APP), a novel method that allows users to control diversity via a single parameter, $\lambda$. APP dynamically prunes prompt segments based on attention scores and is compatible with existing diversity control methods. We demonstrate through extensive experiments that APP effectively modulates diversity, and we propose a method to balance the control trade-offs. Our analysis reveals that all prompt components impose constraints on diversity, with Memory being the most influential. Additionally, high-attention content consistently suppresses output diversity.
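To make the pruning idea concrete, here is a minimal Python sketch of attention-guided unit removal. It is an illustration under stated assumptions, not the authors' implementation. The function name `prune_prompt_units`, the unit labels, and the scores are hypothetical. The rule that maps $\lambda$ to the amount removed (drop units until their combined score reaches a $\lambda$ share of the total) is also an assumption, since this summary only states that a larger $\lambda$ removes more units and that removal proceeds in descending score order by default.

```python
from typing import List, Tuple


def prune_prompt_units(
    units: List[Tuple[str, float]],
    lam: float,
    descending: bool = True,
) -> List[str]:
    """Return the texts of the prompt units kept after pruning.

    units: (text, attention_score) pairs in prompt order, scores >= 0.
    lam: control parameter in [0, 1]; 0 keeps everything, 1 removes everything.
    descending: remove highest-scored units first (the default order
        mentioned in the figure captions); False removes lowest first.

    ASSUMPTION: units are removed until the removed score mass reaches
    lam * total score. The paper's exact rule may differ.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")

    total = sum(score for _, score in units)
    if total <= 0 or lam == 0.0:
        return [text for text, _ in units]

    budget = lam * total
    # Indices of units sorted by score in the chosen removal order.
    order = sorted(range(len(units)), key=lambda i: units[i][1], reverse=descending)

    removed = set()
    removed_mass = 0.0
    for i in order:
        if removed_mass >= budget:
            break
        removed.add(i)
        removed_mass += units[i][1]

    # Preserve the original prompt order for the units that remain.
    return [text for i, (text, _) in enumerate(units) if i not in removed]


# Hypothetical units and scores, for illustration only.
units = [("memory", 5.0), ("previous dialogue", 3.0), ("task instruction", 2.0)]
kept_default = prune_prompt_units(units, lam=0.5)
kept_ascending = prune_prompt_units(units, lam=0.5, descending=False)
```

With these made-up scores, the default order removes only the single highest-scored unit at $\lambda = 0.5$, while the ascending order has to remove two units to reach the same score budget. In practice the scores would come from the model's attention over the prompt, which this sketch does not compute.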

Paper Structure (38 sections, 1 equation, 10 figures, 12 tables, 1 algorithm)

Introduction
Data, Model, and Task for Diversity Evaluation
Data
Model
Task
Adaptive Prompt Pruning
Method
Discussion
Main Results
Post-removal Attention Scores
Retain-1 Analysis
Balancing Diversity Trade-off
Method
Discussion
Comparing Diversity Approaches
...and 23 more sections

Figures (10)

Figure 1: Diversity control in LLM-agent conversations. By increasing $\lambda$, more components are removed from the prompt, selected by their attention scores, thereby enhancing the diversity of the dialogue content.
Figure 2: Diversity decreases over time when using the full prompt (Full). Removing memory and previous dialogues from the prompt (RMmp) alleviates this issue.
Figure 3: Dialogue diversity under our control parameter $\lambda$. As $\lambda$ increases from 0 to 1, diversity generally increases. Removing units based on attention scores in descending order (default) is more word-efficient than removing them in ascending order (asc). Annotated numbers in (a) represent diversity at the endpoints.
Figure 4: Results for $\lambda$ vs. diversity under different model and data settings. Similar trends are observed as in the LLaMA 3, GA setting, despite differences in initial diversity. Annotated numbers indicate diversity at the endpoints.
Figure 5: More analysis of Adaptive Prompt Pruning, discussed in the Discussion subsection of the Adaptive Prompt Pruning section.
...and 5 more figures
