Exploring and Controlling Diversity in LLM-Agent Conversation
KuanChao Chu, Yi-Pei Chen, Hideki Nakayama
TL;DR
This paper addresses the control of dialogue diversity in LLM-based multi-agent simulations by introducing Adaptive Prompt Pruning (APP), a removal-based method that modulates prompt content through a single parameter $\lambda$ using attention-derived unit scores. It shows that prompt components, especially Memory, strongly constrain diversity and that high-attention content tends to suppress variation. APP can be combined with existing diversity techniques and with a post-generation revision step to balance diversity against fidelity. Experiments on the GA and HA datasets with LLaMA-based backbones show that APP increases diversity across settings, while the revision step reduces inconsistencies introduced by pruning. The work also examines how prompt structure, block order, and parametric knowledge interact to shape diversity, offering practical guidance for constructing flexible, robust multi-agent simulations.
Abstract
Controlling diversity in LLM-agent simulations is essential for balancing stability in structured tasks with variability in open-ended interactions. However, we observe that dialogue diversity tends to degrade over long-term simulations. To explore the role of prompt design in this phenomenon, we modularize the utterance generation prompt and find that reducing contextual information leads to more diverse outputs. Based on this insight, we propose Adaptive Prompt Pruning (APP), a novel method that allows users to control diversity via a single parameter, $\lambda$. APP dynamically prunes prompt segments based on attention scores and is compatible with existing diversity control methods. Through extensive experiments, we demonstrate that APP effectively modulates diversity, and we propose a method to balance the resulting control trade-offs. Our analysis reveals that all prompt components impose constraints on diversity, with the Memory component being the most influential. Additionally, high-attention content consistently suppresses output diversity.
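To make the idea of attention-based pruning with a single control parameter concrete, the following is a minimal sketch and not the authors' implementation. It assumes each prompt unit already has a non-negative attention-derived score, and it assumes one plausible removal rule: drop the highest-scoring units for as long as the removed share of total attention stays within the budget set by the parameter. The function name `prune_prompt_units`, the argument `lam`, and the example unit names are hypothetical.

```python
from typing import List, Sequence, Tuple


def prune_prompt_units(
    units: Sequence[str],
    scores: Sequence[float],
    lam: float,
) -> Tuple[List[str], List[str]]:
    """Split prompt units into (kept, dropped) under an attention budget.

    Assumed rule (illustrative only): visit units from highest to lowest
    score and drop each one while the dropped share of the total score
    does not exceed lam. lam = 0 keeps everything; lam = 1 drops everything.
    """
    if len(units) != len(scores):
        raise ValueError("units and scores must have the same length")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")

    total = sum(scores)
    if total <= 0:
        # No attention mass to rank by, so nothing is pruned.
        return list(units), []

    budget = lam * total
    # Indices sorted by descending score; ties keep their original order.
    order = sorted(range(len(units)), key=lambda i: scores[i], reverse=True)

    removed = set()
    mass = 0.0
    for i in order:
        # Small tolerance guards against floating-point rounding of the budget.
        if mass + scores[i] > budget + 1e-12:
            break
        removed.add(i)
        mass += scores[i]

    # Both outputs preserve the original prompt order.
    kept = [u for i, u in enumerate(units) if i not in removed]
    dropped = [u for i, u in enumerate(units) if i in removed]
    return kept, dropped


if __name__ == "__main__":
    example_units = ["memory", "persona", "location", "history"]
    example_scores = [5.0, 2.0, 1.0, 2.0]
    print(prune_prompt_units(example_units, example_scores, lam=0.5))
```

Under this assumed rule, raising `lam` removes more of the high-attention content, which matches the observation above that such content tends to suppress diversity. The actual unit definition, score computation, and stopping criterion used by APP may differ.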
