Chain of Agents: Large Language Models Collaborating on Long-Context Tasks

Yusen Zhang; Ruoxi Sun; Yanfei Chen; Tomas Pfister; Rui Zhang; Sercan Ö. Arik

TL;DR

Chain-of-Agents (CoA) is a training-free, multi-agent framework for long-context tasks. Worker LLMs sequentially read and reason over text chunks, and a dedicated manager agent produces the final answer. By interleaving reading and reasoning and constraining each worker to a short context, CoA covers the entire input without requiring context-window extension. Across nine long-context datasets spanning QA, summarization, and code completion, CoA consistently outperforms both input-reduction (RAG) and full-context baselines, and even surpasses some ultra-long-context LLMs in certain settings. Analyses show that CoA mitigates the lost-in-the-middle effect, scales with input length, and benefits from multi-path ensembles, highlighting its practical potential for complex, long-form tasks.

Abstract

Addressing the challenge of effectively processing long contexts has become a critical issue for Large Language Models (LLMs). Two common strategies have emerged: 1) reducing the input length, such as retrieving relevant chunks by Retrieval-Augmented Generation (RAG), and 2) expanding the context window limit of LLMs. However, both strategies have drawbacks: input reduction has no guarantee of covering the part with the needed information, while window extension struggles to focus on the pertinent information for solving the task. To mitigate these limitations, we propose Chain-of-Agents (CoA), a novel framework that harnesses multi-agent collaboration through natural language to enable information aggregation and context reasoning across various LLMs over long-context tasks. CoA consists of multiple worker agents who sequentially communicate to handle different segmented portions of the text, followed by a manager agent who synthesizes these contributions into a coherent final output. CoA processes the entire input by interleaving reading and reasoning, and it mitigates long-context focus issues by assigning each agent a short context. We perform a comprehensive evaluation of CoA on a wide range of long-context tasks in question answering, summarization, and code completion, demonstrating significant improvements of up to 10% over strong baselines of RAG, Full-Context, and multi-agent LLMs.
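The worker-then-manager flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `LLM` callable type, the function names, the word-based chunking, and the prompt wording are all hypothetical stand-ins chosen for the example.

```python
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Word-based splitting is an assumption made for simplicity; a real
    system would more likely budget by model tokens.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def chain_of_agents(source: str, query: str, worker: LLM, manager: LLM,
                    max_words: int = 2000) -> str:
    """Run workers sequentially over chunks, then let a manager answer."""
    notes = ""  # message handed from one worker to the next
    for chunk in split_into_chunks(source, max_words):
        # Each worker sees only its own chunk plus the previous worker's
        # notes, so its context stays short regardless of input length.
        prompt = (
            f"Previous notes: {notes}\n"
            f"Text chunk: {chunk}\n"
            f"Question: {query}\n"
            "Update the notes with evidence relevant to the question."
        )
        notes = worker(prompt)
    # The manager never reads the source text, only the final notes.
    return manager(f"Notes: {notes}\nQuestion: {query}\nAnswer:")
```

Because every call receives a single chunk and the running notes, the whole input is read exactly once while no individual agent needs a long context window. Passing the same function as both `worker` and `manager` is enough to try the sketch with any prompt-to-text wrapper.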

Paper Structure

This paper contains 34 sections, 6 equations, 7 figures, 13 tables, and 1 algorithm.

Introduction
Related work
Multi-agent LLMs.
Long Context Modeling for LLMs.
Complex Task Reasoning.
Method
Stage 1: Worker Agent: Segment Comprehension and Chain-Communication
Stage 2: Manager Agent: Information Integration and Response Generation
Time Complexity Analysis
Experiment
Experiment Setup
Datasets.
Metrics.
LLMs.
Baselines.
...and 19 more sections

Figures (7)

Figure 1: Overview of Chain-of-Agents, a training-free, task-agnostic, and highly interpretable framework that harnesses multi-agent collaboration for long-context tasks. It consists of multiple worker agents who sequentially communicate to handle different segmented portions of the text, followed by a manager agent who synthesizes these contributions into a coherent final output.
Figure 2: Performance of Claude 3 on BookSum. The improvement is more pronounced for longer inputs.
Figure 3: Comparison on NarrativeQA. The X-axis and Y-axis indicate RAG and CoA performance, respectively, and each point represents a bin. The number indicates the chunk index of the gold answer (with the ratio of the number of samples in brackets), and the size of the point indicates the improvement of CoA over RAG.
Figure 4: Performance of CoA and Full on Natural Questions. CoA mitigates the lost-in-the-middle issue. The X-axis is the index of the document containing the gold answer, where a smaller number indicates that the gold answer is closer to the start.
Figure 5: A case study of RAG (left) and CoA (right) on HotpotQA. The sequential agent communication enables CoA to perform complex multi-hop reasoning over long contexts.
...and 2 more figures
