EvoFlow: Evolving Diverse Agentic Workflows On The Fly

Guibin Zhang, Kaijie Chen, Guancheng Wan, Heng Chang, Hong Cheng, Kun Wang, Shuyue Hu, Lei Bai

TL;DR

EvoFlow reframes agentic workflow design as a cost-aware, multi-objective optimization problem and uses a niching evolutionary framework to evolve a diverse population of heterogeneous, complexity-adaptive workflows. Built from operator nodes and evolved through a pipeline of tag-based retrieval, crossover, and mutation, the population stays diverse while being optimized for both performance and cost, yielding Pareto-front workflows tailored to varying query difficulty. Experiments across six benchmarks show EvoFlow achieving higher performance than handcrafted and automated baselines while reducing inference and training costs, especially when using open-source models in a heterogeneous setting. The work highlights practical gains in efficiency and adaptability for real-world multi-agent systems, supporting scalable, cost-effective deployment of diverse agentic architectures.
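
To make the "Pareto-front workflows" idea concrete, here is a minimal sketch of extracting the non-dominated set from a population scored on two objectives, performance (higher is better) and cost (lower is better). The `Workflow` type, the workflow names, and all numbers are hypothetical illustrations, not values or code from the paper.

```python
from typing import NamedTuple


class Workflow(NamedTuple):
    name: str           # hypothetical label, for illustration only
    performance: float  # higher is better (e.g. accuracy)
    cost: float         # lower is better (e.g. inference cost)


def dominates(a: Workflow, b: Workflow) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.performance >= b.performance and a.cost <= b.cost
    strictly_better = a.performance > b.performance or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(population: list[Workflow]) -> list[Workflow]:
    """Keep every workflow that no other workflow dominates."""
    return [w for w in population
            if not any(dominates(other, w) for other in population)]


# Made-up scores: three useful trade-offs and one dominated workflow.
population = [
    Workflow("io_only", 0.70, 1.0),
    Workflow("cot", 0.78, 2.0),
    Workflow("debate", 0.85, 9.0),
    Workflow("bloated", 0.77, 6.0),  # worse and costlier than "cot"
]
front = pareto_front(population)
print([w.name for w in front])
```

Under these assumed scores, the cheap, mid-range, and expensive workflows all survive because each is the best available at its cost level, while the dominated one is dropped. This is the sense in which a population can serve queries of varying difficulty.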

Abstract

The past two years have witnessed the evolution of large language model (LLM)-based multi-agent systems from labor-intensive manual design to partial automation (e.g., prompt engineering, communication topology) and eventually to fully automated design. However, existing agentic automation pipelines often lack LLM heterogeneity and focus on single-objective performance optimization, limiting their potential to combine weaker models for more customized and cost-effective solutions. To address this challenge, we propose EvoFlow, a niching evolutionary algorithm-based framework to automatically search a population of heterogeneous and complexity-adaptive agentic workflows, rather than a single homogeneous, complex workflow. Technically, EvoFlow performs (1) tag-based retrieval to extract parent workflows from an agentic population, evolves new workflows through (2) crossover and (3) mutation, and employs (4) niching-based selection to maintain population diversity and quality. Extensive evaluations across seven benchmarks demonstrate that EvoFlow is: (I) diverse, evolving a population of workflows ranging from simple I/O tasks to complex multi-turn interactions; (II) high-performing, outperforming previous handcrafted and automated workflows by 1.23% to 29.86%; (III) economical, surpassing powerful o1-preview at 12.4% of its inference cost using weaker open-source models.
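
The four steps named in the abstract can be sketched as a toy loop. This is an illustrative simplification under stated assumptions, not the authors' implementation: the operator names, the tag sets, the `score` function, and the crowding-style replacement that stands in for niching-based selection are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    operators: tuple[str, ...]  # hypothetical operator names
    tags: frozenset[str]        # hypothetical domain tags


OPERATORS = ("io", "cot", "self_refine", "debate", "ensemble")  # made up


def score(w: Workflow) -> float:
    """Toy utility: reward distinct operators, charge 0.4 per operator as a cost proxy."""
    return len(set(w.operators)) - 0.4 * len(w.operators)


def retrieve(pop: list[Workflow], query_tags: frozenset[str], k: int = 2) -> list[Workflow]:
    """(1) Tag-based retrieval: the k workflows sharing the most tags with the query."""
    return sorted(pop, key=lambda w: len(w.tags & query_tags), reverse=True)[:k]


def crossover(a: Workflow, b: Workflow, rng: random.Random) -> Workflow:
    """(2) Crossover: a non-empty prefix of one parent joined to a suffix of the other."""
    cut_a = rng.randint(1, len(a.operators))
    cut_b = rng.randint(0, len(b.operators))
    return Workflow(a.operators[:cut_a] + b.operators[cut_b:], a.tags | b.tags)


def mutate(w: Workflow, rng: random.Random) -> Workflow:
    """(3) Mutation: overwrite one randomly chosen operator."""
    ops = list(w.operators)
    ops[rng.randrange(len(ops))] = rng.choice(OPERATORS)
    return Workflow(tuple(ops), w.tags)


def distance(a: Workflow, b: Workflow) -> int:
    """Dissimilarity: number of operators present in exactly one of the two workflows."""
    return len(set(a.operators) ^ set(b.operators))


def niching_select(pop: list[Workflow], child: Workflow) -> list[Workflow]:
    """(4) Crowding-style stand-in: the child competes only with its most similar rival."""
    rival = min(range(len(pop)), key=lambda i: distance(pop[i], child))
    if score(child) > score(pop[rival]):
        return pop[:rival] + [child] + pop[rival + 1:]
    return pop


def evolve(pop: list[Workflow], query_tags: frozenset[str], steps: int, seed: int = 0) -> list[Workflow]:
    rng = random.Random(seed)  # seeded so runs are repeatable
    for _ in range(steps):
        first, second = retrieve(pop, query_tags)
        child = mutate(crossover(first, second, rng), rng)
        pop = niching_select(pop, child)
    return pop


population = [
    Workflow(("io",), frozenset({"math"})),
    Workflow(("cot",), frozenset({"math", "code"})),
    Workflow(("debate", "debate"), frozenset({"code"})),
]
evolved = evolve(population, frozenset({"math"}), steps=20)
```

Because a child only ever replaces its nearest neighbour, and only when it scores higher, the population size stays fixed and dissimilar workflows do not compete with each other directly. That local competition is the property niching relies on to preserve diversity.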

Paper Structure

This paper contains 46 sections, 20 equations, 6 figures, 6 tables, 1 algorithm.

Figures (6)

  • Figure 1: Paradigm comparison. Baseline methods seek a "one-size-fits-all" complex homogeneous workflow, while EvoFlow optimizes a Pareto set of diverse, heterogeneous workflows.
  • Figure 2: The visualization of notations in EvoFlow.
  • Figure 3: The overall framework of EvoFlow. The fundamental units are invoking nodes, which collectively form operator nodes. EvoFlow initializes the population by combining multiple operator nodes into a workflow (individual), then applies tag-based retrieval followed by crossover and mutation to generate novel offspring workflows. The population is updated via niching-based selection.
  • Figure 4: The cost-performance plane of workflows from EvoFlow, DyLAN, and AFlow.
  • Figure 5: The ablation study of EvoFlow.
  • ...and 1 more figure