Tree of Agents: Improving Long-Context Capabilities of Large Language Models through Multi-Perspective Reasoning
Song Yu, Xiaofei Xu, Ke Deng, Li Li, Lin Tian
TL;DR
TOA tackles long-context challenges in LLMs by enabling multiple agents to read document chunks along different treeed paths, fostering multi-perspective reasoning. The framework comprises three phases—chunk perception, multi-perspective understanding, and consensus formation—augmented by prefix-hash caching and adaptive pruning to reduce redundancy and compute. Empirical results on DetectiveQA and NovelQA show that TOA with a lightweight 8B model outperforms several baselines and rivals larger commercial models on long-context tasks, while maintaining a low none-rate. The findings demonstrate that reading order and cross-agent collaboration can mitigate position bias and hallucinations, enabling efficient and robust long-context understanding with practical deployment implications.
Abstract
Large language models (LLMs) face persistent challenges when handling long-context tasks, most notably the lost in the middle issue, where information located in the middle of a long input tends to be underutilized. Some existing methods that reduce input have the risk of discarding key information, while others that extend context windows often lead to attention dispersion. To address these limitations, we propose Tree of Agents (TOA), a multi-agent reasoning framework that segments the input into chunks processed by independent agents. Each agent generates its local cognition, then agents dynamically exchange information for collaborative reasoning along tree-structured paths. TOA enables agents to probe different reasoning orders for multi-perspective understanding, effectively mitigating position bias and reducing hallucinations. To improve processing efficiency, we incorporate prefix-hash caching and adaptive pruning strategies, achieving significant performance improvements with comparable API overhead. Experiments show that TOA, powered by compact LLaMA3.1-8B, significantly outperforms multiple baselines and demonstrates comparable performance to the latest and much larger commercial models, such as Gemini1.5-pro, on various long-context tasks. Code is available at https://github.com/Aireduce952/Tree-of-Agents.
