Collaborative QA using Interacting LLMs. Impact of Network Structure, Node Capability and Distributed Data
Adit Jain, Vikram Krishnamurthy, Yiming Zhang
TL;DR
The paper studies how networks of interacting LLMs perform collaborative question-answering (CQA) and how hallucinations propagate through the network. It couples mean-field dynamics (MFD) with a randomized utility model (RUM) to describe information diffusion and LLM decision-making, yielding an analytically tractable ODE framework and a data-driven transition kernel. The theoretical results establish existence of a fixed point, a contraction property, and monotone comparative statics in the incentive. Experiments with a network of 100 open-source LLMs show that the truthful fraction ρ_T depends significantly on computation, data placement, topology (notably power-law networks), and the presence of higher-capability nodes. The work offers a principled lens for designing robust, scalable LLM networks for CQA, with implications for privacy, fault tolerance, and distributed reasoning in real-world settings.
Abstract
In this paper, we model and analyze how a network of interacting LLMs performs collaborative question-answering (CQA) in order to estimate a ground truth given a distributed set of documents. This problem is interesting because LLMs often hallucinate when direct evidence to answer a question is lacking, and these effects become more pronounced in a network of interacting LLMs: the hallucination spreads, causing previously accurate LLMs to hallucinate. We study interacting LLMs and their hallucination by combining ideas from mean-field dynamics (MFD) in network science and the randomized utility model in economics to construct a useful generative model. We model each LLM with a latent state that indicates whether it is truthful with respect to the ground truth, and extend a tractable analytical model based on MFD to describe the diffusion of information in a directed network of LLMs. To specify the probabilities that govern the dynamics of the MFD, we propose a randomized utility model. For a network of LLMs in which each LLM has two possible latent states, we give sufficient conditions for the existence and uniqueness of a fixed point and analyze how the fixed point behaves as a function of the incentive (e.g., test-time compute) given to individual LLMs. We experimentally study the behavior of a network of 100 open-source LLMs with respect to data heterogeneity, node capability, network structure, and sensitivity to framing on multiple semi-synthetic datasets.
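
To make the fixed-point claims concrete, the following is a toy sketch only, not the model from the paper. It assumes a single scalar state rho (the fraction of truthful LLMs), a binary logit choice rule (the standard form a random utility model takes under Gumbel noise), and a hypothetical utility gap of the form incentive + coupling * (2 * rho - 1). The names `incentive`, `coupling`, `logit_choice`, and `truthful_fixed_point` are illustrative assumptions. Under these assumptions the update map has slope at most coupling / 2, so it is a contraction whenever coupling < 2, which gives a unique fixed point that increases with the incentive.

```python
import math


def logit_choice(utility_gap: float) -> float:
    """Probability of choosing the truthful state under a binary logit rule.

    Assumption: utility noise is Gumbel, so the choice probability is a sigmoid.
    """
    return 1.0 / (1.0 + math.exp(-utility_gap))


def truthful_fixed_point(incentive: float, coupling: float,
                         rho0: float = 0.5, tol: float = 1e-12,
                         max_iter: int = 10_000) -> float:
    """Iterate rho <- logit_choice(incentive + coupling * (2 * rho - 1)).

    Hypothetical toy dynamics: each LLM is more likely to be truthful when
    its incentive is higher and when more of the network is truthful.
    The map's slope is at most coupling / 2, so coupling < 2 guarantees
    convergence to a unique fixed point from any starting value in [0, 1].
    """
    rho = rho0
    for _ in range(max_iter):
        nxt = logit_choice(incentive + coupling * (2.0 * rho - 1.0))
        if abs(nxt - rho) < tol:
            return nxt
        rho = nxt
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    # Sweep the incentive and print the resulting truthful fraction.
    for a in (-1.0, 0.0, 1.0, 2.0):
        print(a, truthful_fixed_point(incentive=a, coupling=1.0))
```

In this sketch, starting the iteration from rho0 = 0.0 or rho0 = 1.0 leads to the same limit when coupling < 2, which mirrors the uniqueness statement; sweeping `incentive` illustrates the monotone dependence of the fixed point on the incentive.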
