Enabling Agents to Communicate Entirely in Latent Space

Zhuoyun Du, Runze Wang, Huiyu Bai, Zouying Cao, Xiaoyong Zhu, Bo Zheng, Wei Chen, Haochao Ying

TL;DR

The paper addresses the bottleneck of language-based inter-agent communication in LLM-driven systems by proposing Interlat, which transmits full latent representations (last-layer hidden states) between agents and then compresses them explicitly. It introduces a training framework with conditional mind separation and plan-alignment losses, plus a latent-space compression stage that produces concise yet information-rich messages. In ALFWorld experiments, Interlat improves task success and promotes exploratory, multi-path reasoning, while reducing communication latency by up to about 24× with minimal performance loss. The work demonstrates the feasibility and benefits of fully latent-space inter-agent communication and offers practical guidance for building more efficient and capable multi-agent systems.

Abstract

While natural language is the de facto communication medium for LLM-based agents, it presents a fundamental constraint. The process of downsampling rich, internal latent states into discrete tokens inherently limits the depth and nuance of information that can be transmitted, thereby hindering collaborative problem-solving. Inspired by human mind-reading, we propose Interlat (Inter-agent Latent Space Communication), a paradigm that leverages the last hidden states of an LLM as a representation of its mind for direct transmission (termed latent communication). An additional compression process further compresses latent communication via entirely latent space reasoning. Experiments demonstrate that Interlat outperforms both fine-tuned chain-of-thought (CoT) prompting and single-agent baselines, promoting more exploratory behavior and enabling genuine utilization of latent information. Further compression not only substantially accelerates inference but also maintains competitive performance through an efficient information-preserving mechanism. We position this work as a feasibility study of entirely latent space inter-agent communication, and our results highlight its potential, offering valuable insights for future research.
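
The contrast the abstract draws between language-space and latent-space messages can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' implementation: the random `embedding` table and `adapter_w` matrix stand in for a real LLM's parameters and the communication adapter, and prepending the message to the receiver's prompt embeddings is just one plausible way to feed it in.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, D_MODEL, MSG_LEN, PROMPT_LEN = 50, 16, 6, 4

# Hypothetical stand-ins for a real LLM's parameters (assumptions, not Interlat's weights).
embedding = rng.normal(size=(VOCAB, D_MODEL))                        # token embedding table
adapter_w = rng.normal(size=(D_MODEL, D_MODEL)) / np.sqrt(D_MODEL)   # toy communication adapter


def language_message(hidden: np.ndarray) -> np.ndarray:
    """Language-space path: project to the vocabulary, keep one token per position, re-embed."""
    logits = hidden @ embedding.T        # (MSG_LEN, VOCAB)
    token_ids = logits.argmax(axis=-1)   # discretization: each state collapses to one token id
    return embedding[token_ids]          # (MSG_LEN, D_MODEL)


def latent_message(hidden: np.ndarray) -> np.ndarray:
    """Latent-space path: pass the last-layer hidden states through the adapter unchanged in length."""
    return hidden @ adapter_w            # (MSG_LEN, D_MODEL)


def receiver_inputs(message: np.ndarray, prompt_ids: np.ndarray) -> np.ndarray:
    """Prepend the message to the receiver's own prompt embeddings (one possible wiring)."""
    return np.concatenate([message, embedding[prompt_ids]], axis=0)


sender_hidden = rng.normal(size=(MSG_LEN, D_MODEL))   # pretend last-layer hidden states
prompt_ids = rng.integers(0, VOCAB, size=PROMPT_LEN)

lang_in = receiver_inputs(language_message(sender_hidden), prompt_ids)
lat_in = receiver_inputs(latent_message(sender_hidden), prompt_ids)
print(lang_in.shape, lat_in.shape)
```

In the language path every message row is forced to be one of the `VOCAB` rows of `embedding`, whereas the latent path can carry any vector in the model dimension. The sketch does not model the compression stage, which the abstract describes only at a high level.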

Paper Structure

This paper contains 34 sections, 15 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: A comparison of Interlat with conventional language-space communication. In language space, an agent transmits a discrete token sequence $[x_i, x_{i+1}, \dots, x_{i+j+1}]$ (e.g., a CoT plan) to another agent. In Interlat, the model leverages its last hidden states as a representation of its internal “mind” state, processed by a communication adapter, and then transmits them directly to the other agent, enabling communication entirely in latent space with higher expressive capacity.
  • Figure 2: Training the reasoning model with frozen-actor supervision.
  • Figure 3: Training dynamics of the cross-entropy loss and separation loss: an initial plateau near 0.69 indicates no separation between matched/mismatched latents, followed by a sharp drop after $\sim2.2$k steps, marking the model’s “aha” moment in exploiting task-relevant latent information.
  • Figure 4: Analysis of parallelism in latent communication across the first six steps. The red denotes latents from the trained model, and the gray is the untrained model. Trained latents retain stable vertical gaps between successive Top-$k$ bands and exhibit markedly lower $P_{50}(S_{10})$, indicating persistent parallelism, whereas the untrained latents progressively collapse toward Top-1.
  • Figure 5: The task-averaged relative change $\Delta\mathrm{CE}$ and the relative saving rates before and after the compression training process.
  • ...and 4 more figures
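
The Figure 3 caption reports a plateau near 0.69 but does not give the form of the separation loss. One possible reading, offered here as an assumption and not something the caption states, is that 0.69 is approximately $\ln 2$, the value a two-way cross-entropy takes when the model assigns probability 1/2 to both the matched and the mismatched case:

```latex
% Binary cross-entropy when the predicted probability is 1/2, whatever the label y \in \{0, 1\}:
\mathcal{L}_{\text{chance}}
  = -\bigl[\, y \ln \tfrac{1}{2} + (1 - y) \ln \tfrac{1}{2} \,\bigr]
  = \ln 2 \approx 0.693
```

Under that reading, a loss that stays at this value means the model is not yet distinguishing matched from mismatched latents, which agrees with the caption's description of the plateau.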