Group-Evolving Agents: Open-Ended Self-Improvement via Experience Sharing

Zhaotian Weng; Antonis Antoniades; Deepak Nathani; Zhen Zhang; Xiao Pu; Xin Eric Wang

TL;DR

Group-Evolving Agents (GEA) shifts open-ended self-improvement from single-agent, tree-structured evolution to group-level evolution with explicit intra-group experience sharing. GEA maintains an archive of discovered agents and iterates a two-stage process: it selects a parent group via a Performance–Novelty criterion, then generates an offspring group from the parents' aggregated experiences. On coding benchmarks, GEA performs substantially better than state-of-the-art open-ended baselines and rivals human-designed frameworks. The results show that GEA consolidates exploratory diversity into durable improvements, transfers across different coding models, and is robust to framework-level perturbations, all through meta-learning-inspired self-improvement without human intervention. The work highlights the potential of group-centric open-ended evolution for autonomous, scalable improvement of software design and tool use, while underscoring the need to manage computational resources and alignment considerations.

Abstract

Open-ended self-improving agents can autonomously modify their own structural designs to advance their capabilities and overcome the limits of pre-defined architectures, thus reducing reliance on human intervention. We introduce Group-Evolving Agents (GEA), a new paradigm for open-ended self-improvement that treats a group of agents as the fundamental evolutionary unit, enabling explicit experience sharing and reuse within the group throughout evolution. Existing open-ended self-evolving paradigms adopt tree-structured evolution, in which isolated evolutionary branches make inefficient use of exploratory diversity; GEA overcomes this limitation. We evaluate GEA on challenging coding benchmarks, where it significantly outperforms state-of-the-art self-evolving methods (71.0% vs. 56.7% on SWE-bench Verified, 88.3% vs. 68.3% on Polyglot) and matches or exceeds top human-designed agent frameworks (71.8% and 52.0% on the two benchmarks, respectively). Analysis reveals that GEA more effectively converts early-stage exploratory diversity into sustained, long-term progress, achieving stronger performance under the same number of evolved agents. Furthermore, GEA exhibits consistent transferability across different coding models and greater robustness, fixing framework-level bugs in 1.4 iterations on average, versus 5 for self-evolving methods.
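
The two-stage loop summarized in the TL;DR (select a parent group by performance and novelty, then produce an offspring group from pooled experience) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Agent` fields, the Euclidean nearest-neighbor novelty measure, the additive score with `novelty_weight`, and the placeholder `evolve_group` step are all hypothetical. In the actual system the offspring would be produced by a model-driven self-modification step and re-evaluated on the benchmark.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    name: str
    performance: float          # benchmark success rate in [0, 1]
    behavior: List[float]       # hypothetical per-task outcome vector (1.0 = solved)
    experience: List[str] = field(default_factory=list)  # hypothetical trace notes


def novelty(agent: Agent, archive: List[Agent], k: int = 1) -> float:
    """Mean Euclidean distance to the k nearest other agents (assumed metric)."""
    dists = sorted(
        math.dist(agent.behavior, other.behavior)
        for other in archive
        if other is not agent
    )
    nearest = dists[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0


def select_parent_group(archive: List[Agent], group_size: int,
                        novelty_weight: float = 0.5, k: int = 1) -> List[Agent]:
    """Stage 1: rank the archive by an assumed performance + novelty score."""
    def score(agent: Agent) -> float:
        return agent.performance + novelty_weight * novelty(agent, archive, k)
    return sorted(archive, key=score, reverse=True)[:group_size]


def evolve_group(parents: List[Agent], iteration: int) -> List[Agent]:
    """Stage 2: every offspring sees the experience pooled from all parents."""
    shared = [note for parent in parents for note in parent.experience]
    return [
        Agent(
            name=f"{parent.name}-gen{iteration}",
            performance=parent.performance,   # placeholder: would be re-evaluated
            behavior=list(parent.behavior),
            experience=list(shared),          # intra-group experience sharing
        )
        for parent in parents
    ]


# Toy archive: "a" and "b" behave identically, "c" is weaker but novel.
archive = [
    Agent("a", 0.6, [1.0, 1.0, 0.0, 0.0], ["tool: grep"]),
    Agent("b", 0.5, [1.0, 1.0, 0.0, 0.0], ["tool: view"]),
    Agent("c", 0.4, [0.0, 0.0, 1.0, 1.0], ["tool: edit"]),
]
parents = select_parent_group(archive, group_size=2)
offspring = evolve_group(parents, iteration=1)
archive.extend(offspring)  # offspring join the archive for later iterations
```

In this toy run the novel but weaker agent "c" is selected alongside the strongest agent "a", and both offspring inherit the pooled notes from both parents. That pooling is the contrast with tree-structured evolution, where each branch would only see its own ancestor's trace.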

Paper Structure (25 sections, 3 equations, 5 figures, 3 tables, 2 algorithms)

Introduction
Related Work
Method
Parent Group Selection
Open-Ended Group Evolution
Experiments
Benchmarks
SWE-bench.
Polyglot.
Experimental Settings
Baselines
Open-Ended Self-Evolving Baseline.
Human-Designed Frameworks.
Results and Analysis
Main Results
...and 10 more sections

Figures (5)

Figure 1: Overview of Group-Evolving Agents (GEA) vs. tree-structured self-evolution for open-endedness. GEA treats a group of agents, rather than an individual agent, as the fundamental unit of evolution. At each iteration, a parent group jointly gives rise to an offspring group through explicit intra-group experience sharing and reuse.
Figure 2: Detailed illustration of group-level evolution in GEA. Aggregated evolutionary traces from the parent group are shared across all agents to generate evolution directives and framework-level patches.
Figure 3: Performance comparison between GEA and DGM (self-evolving baseline) on two coding benchmarks. Under the same number of evolved agents, GEA exhibits substantially larger performance gains than DGM on both SWE-bench and Polyglot, demonstrating the improved efficiency of group-level evolution.
Figure 4: Evolution analysis of tool discovery and integration over iterations. Each row (T1–T9) corresponds to a key tool-level functionality. Blue markers indicate tools that have been discovered but not yet integrated into the current best agent, while red markers indicate tools integrated into the best-performing agent.
Figure 5: Model transfer results on both benchmarks. Across all coding models, the GEA best agent consistently outperforms the corresponding initial (iteration-0) agent, demonstrating that the improvements induced by group-level evolution generalize across different underlying model backbones.
