DocAgent: A Multi-Agent System for Automated Code Documentation Generation
Dayu Yang, Antoine Simoulin, Xin Qian, Xiaoyi Liu, Yuwei Cao, Zhaopu Teng, Grey Yang
TL;DR
DocAgent addresses unreliable automatic documentation in large and proprietary codebases with a dependency-aware multi-agent system that builds context incrementally. A Navigator computes a dependency-first generation order over a DAG derived from the repository's ASTs, so each component is documented after the components it depends on. Reader, Searcher, Writer, Verifier, and Orchestrator agents then collaboratively draft, verify, and refine the documentation. The paper also proposes a multi-faceted evaluation framework covering Completeness, Helpfulness, and Truthfulness, and reports experiments in which DocAgent outperforms state-of-the-art baselines. The results point to topological processing and adaptive context management as the key ingredients for reliable, scalable automatic code documentation. The paper also discusses ethics and limitations.
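To make the dependency-first ordering concrete, here is a minimal sketch in Python. It is not DocAgent's implementation: the function name `dependency_first_order`, the restriction to top-level functions in a single module, and the assumption that the call graph has no cycles are all illustrative choices. It uses only the standard-library `ast` and `graphlib` modules (Python 3.9+).

```python
import ast
from graphlib import TopologicalSorter


def dependency_first_order(source: str) -> list[str]:
    """Return top-level function names so that callees come before callers.

    Hypothetical helper for illustration only. It sees only direct calls by
    bare name (e.g. `foo()`), and assumes the call graph is acyclic;
    graphlib raises CycleError otherwise.
    """
    tree = ast.parse(source)
    funcs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}

    # Map each function to the set of in-module functions it calls.
    graph: dict[str, set[str]] = {}
    for name, node in funcs.items():
        deps = set()
        for sub in ast.walk(node):
            if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                callee = sub.func.id
                if callee in funcs and callee != name:  # skip self-recursion
                    deps.add(callee)
        graph[name] = deps

    # TopologicalSorter treats the dict values as predecessors, so
    # static_order() yields dependencies before their dependents.
    return list(TopologicalSorter(graph).static_order())


EXAMPLE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def run(path):
    return parse(load(path))
'''

order = dependency_first_order(EXAMPLE)
print(order)  # `run` appears after both `load` and `parse`
```

In a pipeline of this shape, documentation for `load` and `parse` would already exist when `run` is processed, so it could be supplied as context instead of the full source of the callees.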
Abstract
High-quality code documentation is crucial for software development, especially in the era of AI. However, generating it automatically with Large Language Models (LLMs) remains challenging, as existing approaches often produce incomplete, unhelpful, or factually incorrect outputs. We introduce DocAgent, a novel multi-agent collaborative system that uses topological code processing to build context incrementally. Specialized agents (Reader, Searcher, Writer, Verifier, Orchestrator) then collaboratively generate documentation. We also propose a multi-faceted evaluation framework assessing Completeness, Helpfulness, and Truthfulness. Comprehensive experiments show that DocAgent consistently and significantly outperforms baselines. Our ablation study confirms the vital role of the topological processing order. DocAgent offers a robust approach to reliable code documentation generation in complex and proprietary repositories.
