CoWork-X: Experience-Optimized Co-Evolution for Multi-Agent Collaboration System
Zexin Lin, Jiachen Yu, Haoyang Zhang, Yuzhao Li, Zhonghang Li, Yujiu Yang, Junjie Wang, Xiaoqiang Ji
TL;DR
CoWork-X tackles real-time coordination and continual cross-episode adaptation under strict online budgets by introducing an Execute--Optimize loop that separates fast, HTN-based execution from offline, patch-style skill updates. A shared HTN skill library $\mathcal{S}_k$ is incrementally improved by a post-episode Co-Optimizer using episode logs, enabling stable multi-agent co-evolution. In Overcooked-like benchmarks, CoWork-X achieves sustained gains with zero online tokens and markedly lower latency ($\approx 2.6$ s per episode) compared to baselines that rely on frequent in-episode LLM reasoning, while generalizing across multiple LLM backbones. The work demonstrates practical, scalable cross-episode collaboration and highlights the value of log-grounded, verifier-driven skill consolidation for real-time multi-agent systems.
Abstract
Large language models are enabling language-conditioned agents in interactive environments, but highly cooperative tasks often impose two simultaneous constraints: sub-second real-time coordination and sustained multi-episode adaptation under a strict online token budget. Existing approaches either rely on frequent in-episode reasoning that induces latency and timing jitter, or deliver post-episode improvements through unstructured text that is difficult to compile into reliable low-cost execution. We propose CoWork-X, an active co-evolution framework that casts peer collaboration as a closed-loop optimization problem across episodes, inspired by fast--slow memory separation. CoWork-X instantiates a Skill-Agent that executes via hierarchical task network (HTN)-based skill retrieval from a structured, interpretable, and compositional skill library, and a post-episode Co-Optimizer that performs patch-style skill consolidation with explicit budget constraints and drift regularization. Experiments on challenging Overcooked-AI-like real-time collaboration benchmarks demonstrate that CoWork-X achieves stable, cumulative performance gains while steadily reducing online latency and token usage.
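
To make the split between fast execution and slow consolidation concrete, the following is a minimal toy sketch of an Execute--Optimize loop. It is not the authors' implementation: `SkillLibrary`, `execute_episode`, `co_optimize`, `make_env`, `propose`, and the `max_patches` budget are all hypothetical names introduced here for illustration, and the rule-based `propose` function merely stands in for an LLM-backed Co-Optimizer. Verification of patches and drift regularization are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class SkillLibrary:
    """Hypothetical toy stand-in for the shared skill library S_k."""
    # Maps a compound task to an ordered list of subtasks; any name
    # absent from the mapping is treated as a primitive action.
    methods: dict[str, list[str]] = field(default_factory=dict)
    version: int = 0  # plays the role of the index k

    def decompose(self, task: str) -> list[str]:
        """Expand a task into primitive actions by pure lookup (no LLM call)."""
        if task not in self.methods:
            return [task]
        plan: list[str] = []
        for sub in self.methods[task]:
            plan.extend(self.decompose(sub))
        return plan


def execute_episode(lib, goal, env_step):
    """Fast path: run the retrieved plan and log each action's outcome."""
    return [(action, env_step(action)) for action in lib.decompose(goal)]


def co_optimize(lib, log, propose_patch, max_patches=1):
    """Slow path: after the episode, apply at most max_patches small edits."""
    patches = []
    for action, succeeded in log:
        if not succeeded:
            patch = propose_patch(action)  # assumed proposer, e.g. an LLM
            if patch is not None:
                patches.append(patch)
    new_methods = dict(lib.methods)
    for task, subtasks in patches[:max_patches]:  # explicit patch budget
        new_methods[task] = subtasks
    return SkillLibrary(new_methods, lib.version + 1)


def make_env():
    """Toy kitchen: serving only succeeds once a plate has been fetched."""
    state = {"has_plate": False}

    def step(action):
        if action == "fetch_plate":
            state["has_plate"] = True
            return True
        if action == "serve":
            return state["has_plate"]
        return True

    return step


def propose(failed_action):
    """Rule-based placeholder for a log-grounded patch proposal."""
    if failed_action == "serve":
        return ("make_soup", ["cook", "fetch_plate", "serve"])
    return None


lib0 = SkillLibrary({"make_soup": ["cook", "serve"]})
log0 = execute_episode(lib0, "make_soup", make_env())
lib1 = co_optimize(lib0, log0, propose)
log1 = execute_episode(lib1, "make_soup", make_env())
```

In this toy run the first episode fails at `serve`, the post-episode step patches the `make_soup` method, and the second episode succeeds without any reasoning inside the episode itself. The old library object is left unchanged, so each version remains available for comparison.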
