COCO: Cognitive Operating System with Continuous Oversight for Multi-Agent Workflow Reliability

Churong Liang, Jinling Gan, Kairan Hong, Qiushi Tian, Zongze Wu, Runnan Li

Abstract

A critical limitation of large-scale multi-agent systems is the cascading of errors: without intermediate verification, downstream agents exacerbate upstream inaccuracies, resulting in significant quality degradation. To address this limitation, we introduce COCO (Cognitive Operating System with Continuous Oversight), a theoretically grounded framework for asynchronous self-monitoring and adaptive error correction in multi-agent systems. COCO reconciles the tension between quality assurance and computational efficiency through a decoupled architecture that isolates error detection from the critical execution path and incorporates an automated configuration engine to minimize deployment complexity. The framework relies on three algorithmic innovations to mitigate both systematic and stochastic errors: (1) a Contextual Rollback Mechanism that leverages execution history for informed state recovery rather than naive retries; (2) a Bidirectional Reflection Protocol that ensures convergence and prevents oscillatory control loops; and (3) a Heterogeneous Cross-Validation Mechanism that uses ensemble disagreement to identify bias and hallucinations. Extensive experiments on diverse benchmarks demonstrate that COCO delivers a 6.5% average performance improvement. Notably, the framework achieves 95.1% of large-model performance with a 30× parameter reduction, confirming the potential for efficient, high-reliability deployment and establishing COCO as a practical, annotation-based solution for critical autonomous domains.
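The idea of using ensemble disagreement as an error signal can be illustrated with a minimal sketch. This is not the paper's Heterogeneous Cross-Validation Mechanism: the function names, the exact-match comparison of answers, and the 0.34 threshold are all assumptions chosen for illustration.

```python
from collections import Counter


def disagreement(answers):
    """Fraction of validators whose answer differs from the majority answer.

    `answers` holds one answer per (hypothetical) heterogeneous validator.
    Exact string equality is an assumed stand-in for a real comparison.
    """
    if not answers:
        raise ValueError("need at least one answer")
    _, majority_count = Counter(answers).most_common(1)[0]
    return 1.0 - majority_count / len(answers)


def needs_review(answers, threshold=0.34):
    """Flag an output when disagreement exceeds an assumed threshold."""
    return disagreement(answers) > threshold
```

With three validators, a single dissenting answer stays just under this threshold, while three mutually different answers exceed it and would be flagged.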

Paper Structure

This paper contains 20 sections, 4 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: By intercepting intermediate execution errors through asynchronous monitoring, COCO initiates a self-repair loop that prevents the downstream propagation of inaccuracies.
  • Figure 2: Architectural overview of the COCO framework. The system is composed of three decoupled layers: (1) The Decorator Binding Layer facilitates non-intrusive, asynchronous oversight via agent-level hooks; (2) The Monitoring Engine leverages LLMs to synthesize structured diagnostic reports from execution outputs; and (3) The Correction Engine orchestrates state restoration, integrating CRM for contextual rollback with BRP and HCV to guarantee correction stability.
  • Figure 3: Case study of a COCO-integrated workflow for generating recommendations from product reviews. COCO monitors node2 asynchronously; in the lower path, the workflow continues before correction occurs. In the upper path, COCO detects an omission and rolls back node2, producing an improved result.
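The flow described in Figures 2 and 3 (a decorator hook, a monitor that runs off the critical path, and a rollback that re-executes a node with diagnostic context) can be sketched roughly as follows. This is an illustrative sketch only, not COCO's implementation: the `monitored` decorator, the `history` list, and the `feedback` parameter are hypothetical, and a keyword check stands in for the LLM-based Monitoring Engine.

```python
import asyncio
import functools


def monitored(check):
    """Hypothetical decorator: run the agent, then audit its output in a background task."""
    def decorator(agent):
        @functools.wraps(agent)
        async def wrapper(state, history):
            # Critical path: the agent's output is returned without waiting for the audit.
            output = await agent(state, feedback=None)
            history.append((agent.__name__, state, output))

            async def audit():
                issue = await check(state, output)
                if issue is None:
                    return output
                # Rollback sketch: re-run from the saved input, passing the diagnosis along.
                repaired = await agent(state, feedback=issue)
                history.append((agent.__name__, state, repaired))
                return repaired

            return output, asyncio.create_task(audit())
        return wrapper
    return decorator


async def check_mentions_price(state, output):
    """Stand-in monitor (assumption): report an omission if 'price' is missing."""
    return None if "price" in output else "omitted: price"


@monitored(check_mentions_price)
async def node2(state, feedback=None):
    """Toy agent: produces a summary and extends it when given feedback."""
    summary = f"summary of {state}"
    if feedback:
        summary += " (including price)"
    return summary


async def run_demo():
    history = []
    first, audit_task = await node2("reviews", history)
    final = await audit_task  # a downstream node could start from `first` before this resolves
    return first, final, history


first, final, history = asyncio.run(run_demo())
```

Here `first` corresponds to the uncorrected lower path in Figure 3 and `final` to the upper path after rollback. Returning the audit as a task is one possible design choice for letting the workflow continue while monitoring is still in progress.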