SEW: Self-Evolving Agentic Workflows for Automated Code Generation

Siwei Liu; Jinyuan Fang; Han Zhou; Yingxu Wang; Zaiqiao Meng

TL;DR

SEW introduces a self-evolving framework that automatically designs and optimizes multi-agent workflows and per-agent prompts for automated code generation. By employing workflow generation, evolution, and agent evolution guided by mutation operators, SEW discovers novel topologies and high-quality prompts, outperforming strong baselines across three benchmarks. The study analyzes five textual workflow representations, finding CoRE to offer the best balance between interpretability and executability, and demonstrates that both workflow and agent evolution contribute to performance gains. These results highlight SEW’s potential to reduce manual workflow design and enable adaptive, scalable agentic systems for software engineering tasks, while outlining limitations and avenues for broader applicability.

Abstract

Large Language Models (LLMs) have demonstrated effectiveness in code generation tasks. To enable LLMs to address more complex coding challenges, existing research has focused on crafting multi-agent systems with agentic workflows, in which complex coding tasks are decomposed into sub-tasks and assigned to specialized agents. Despite their effectiveness, current approaches rely heavily on hand-crafted agentic workflows, with both agent topologies and prompts designed manually, which limits their ability to adapt automatically to different types of coding problems. To address these limitations and enable automated workflow design, we propose Self-Evolving Workflow (SEW), a novel self-evolving framework that automatically generates and optimises multi-agent workflows. Extensive experiments on three coding benchmark datasets, including the challenging LiveCodeBench, demonstrate that SEW can automatically design agentic workflows and optimise them through self-evolution, bringing up to 33% improvement on LiveCodeBench compared to using the backbone LLM only. Furthermore, by investigating different workflow representation schemes, we provide insights into the optimal way to encode workflow information as text.
