Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
Tianqing Fang, Zhisong Zhang, Xiaoyang Wang, Rui Wang, Can Qin, Yuxuan Wan, Jun-Yu Ma, Ce Zhang, Jiaqi Chen, Xiyun Li, Hongming Zhang, Haitao Mi, Dong Yu
TL;DR
Cognitive Kernel-Pro is a fully open-source, multi-module framework for deep research agents that emphasizes modularity, data-centric training, and inference-time optimization. A main coordinating agent works with specialized sub-agents that handle web navigation, file processing, and tool use; all of them are powered by a shared agent foundation model and act through Python code. The authors present a data-collection and augmentation recipe (including multi-hop web data, exploration-based synthesis, and Persona Hub prompts) and report state-of-the-art results on GAIA among open-source agents that use only free tools, with the 8B CK-Pro model surpassing earlier open-source systems such as WebDancer and WebSailor. The work also applies test-time reflection and voting to improve robustness, and releases a public codebase to support reproducibility and further research on open agent systems.
Abstract
General AI Agents are increasingly recognized as foundational frameworks for the next generation of artificial intelligence, enabling complex reasoning, web interaction, coding, and autonomous research capabilities. However, current agent systems are either closed-source or heavily reliant on a variety of paid APIs and proprietary tools, limiting accessibility and reproducibility for the research community. In this work, we present Cognitive Kernel-Pro, a fully open-source and (to the maximum extent) free multi-module agent framework designed to democratize the development and evaluation of advanced AI agents. Within Cognitive Kernel-Pro, we systematically investigate the curation of high-quality training data for Agent Foundation Models, focusing on the construction of queries, trajectories, and verifiable answers across four key domains: web, file, code, and general reasoning. Furthermore, we explore novel strategies for agent test-time reflection and voting to enhance agent robustness and performance. We evaluate Cognitive Kernel-Pro on GAIA, achieving state-of-the-art results among open-source and free agents. Notably, our 8B-parameter open-source model surpasses previous leading systems such as WebDancer and WebSailor, establishing a new performance standard for accessible, high-capability AI agents. Code is available at https://github.com/Tencent/CognitiveKernel-Pro
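The test-time reflection and voting mentioned in the abstract can be illustrated with a small sketch. This is not the implementation from the Cognitive Kernel-Pro codebase: the names `run_agent`, `reflect`, `n_runs`, and `max_retries` are hypothetical, and plain majority voting over normalized answers is a simplifying assumption standing in for whatever selection rule the framework actually uses.

```python
from collections import Counter
from typing import Callable, List, Optional


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match."""
    return " ".join(answer.strip().lower().split())


def vote(answers: List[str]) -> Optional[str]:
    """Return the most frequent non-empty answer; ties go to the first one seen."""
    cleaned = [normalize(a) for a in answers if a and a.strip()]
    if not cleaned:
        return None
    return Counter(cleaned).most_common(1)[0][0]


def run_with_reflection_and_voting(
    run_agent: Callable[[str], str],      # hypothetical: one full agent attempt
    reflect: Callable[[str, str], bool],  # hypothetical: True if the answer looks acceptable
    task: str,
    n_runs: int = 3,
    max_retries: int = 1,
) -> Optional[str]:
    """Run the agent several times, retry attempts that fail reflection, then vote."""
    answers: List[str] = []
    for _ in range(n_runs):
        answer = run_agent(task)
        retries = 0
        # Reflection step: re-run the attempt while the check rejects it.
        while not reflect(task, answer) and retries < max_retries:
            answer = run_agent(task)
            retries += 1
        answers.append(answer)
    # Voting step: aggregate the per-run answers into one final answer.
    return vote(answers)
```

In this sketch, reflection acts as a per-attempt filter and voting as an aggregate across attempts, so a single failed trajectory is less likely to determine the final answer.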
