Tracking the perspectives of interacting language models
Hayden Helm, Brandon Duderstadt, Youngser Park, Carey E. Priebe
TL;DR
This work models a system of interacting LLMs as a directed graph $G=(V,E)$ with vertices $V=\mathcal{F}\cup\mathcal{D}$ and time-varying edges $E^{(t)}$ in order to study information diffusion. It introduces a perspective space, built from a surrogate data kernel and classical multidimensional scaling (CMDS), that quantifies model-wise differences in responses to a fixed prompt set $\mathbf{X}$ and so enables comparisons across heterogeneous models. Three case studies illustrate how different communication structures drive phenomena such as global and local sinks, adversarial influence and diffusion, and cross-class polarization, measured with the iso-mirror, the adjusted Rand index (ARI), and a polarization metric. The approach provides a quantitative framework for analyzing AI ecosystems and their analogs in human-model forums. It offers insight into interventions and system health, while acknowledging its simplifications and the need for broader sociotechnical validation.
Abstract
Large language models (LLMs) are capable of producing high-quality information at unprecedented rates. As these models continue to entrench themselves in society, the content they produce will become increasingly pervasive in databases that are, in turn, incorporated into the pre-training data, fine-tuning data, retrieval data, etc. of other language models. In this paper, we formalize the idea of a communication network of LLMs and introduce a method for representing the perspective of individual models within a collection of LLMs. Given these tools, we systematically study information diffusion in the communication network of LLMs in various simulated settings.
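
To make the idea of a perspective space concrete, the following is a minimal sketch, not the authors' implementation. It assumes each model is summarized by an array of embedded responses to the shared prompt set (the toy embeddings, the Frobenius-type distance, and the helper names `classical_mds` and `perspective_space` are all illustrative assumptions), and it applies classical multidimensional scaling to the pairwise distances between models.

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Classical MDS of a symmetric distance matrix D of shape (n, n)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[idx], 0.0, None)     # guard against tiny negatives
    return eigvecs[:, idx] * np.sqrt(lam)

def perspective_space(embeddings, n_components=2):
    """embeddings: (n_models, n_prompts, dim) array of embedded responses.

    Returns low-dimensional coordinates (one row per model) and the
    pairwise distance matrix between models.
    """
    n_models = embeddings.shape[0]
    flat = embeddings.reshape(n_models, -1)
    diff = flat[:, None, :] - flat[None, :, :]
    D = np.linalg.norm(diff, axis=-1)          # Frobenius distance per model pair
    return classical_mds(D, n_components), D

# Toy data (hypothetical): 4 models, 5 prompts, 8-dimensional embeddings.
# Models 0 and 1 share one "perspective"; models 2 and 3 share another.
rng = np.random.default_rng(0)
base_a = rng.normal(size=(5, 8))
base_b = rng.normal(size=(5, 8)) + 3.0
emb = np.stack([
    base_a + 0.01 * rng.normal(size=(5, 8)),
    base_a + 0.01 * rng.normal(size=(5, 8)),
    base_b + 0.01 * rng.normal(size=(5, 8)),
    base_b + 0.01 * rng.normal(size=(5, 8)),
])
coords, D = perspective_space(emb, n_components=2)
```

In this toy setting, models with similar responses land close together in `coords`. Recomputing the coordinates after each round of interaction would give one trajectory per model, which is the kind of object the case studies track over time.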
