CoEvo: Continual Evolution of Symbolic Solutions Using Large Language Models

Ping Guo, Qingfu Zhang, Xi Lin

TL;DR

CoEvo addresses open-ended symbolic discovery by uniting LLM-driven reasoning with evolutionary search in a framework that dynamically manages evolving knowledge. Solutions are generated and refined across multiple representations (natural language, LaTeX formulas, and Python code) via an idea-tree process with scalable levels $N_0$, $N_1$, … and guided by a knowledge library of finite size $K$ (e.g., $K=30$). The knowledge library summarizes, clusters, and reuses ideas through Random Reuse and Similarity-based Reuse to enable continual improvement. On AI Feynman benchmarks, CoEvo achieves state-of-the-art NMSE across problems, often by large margins, and is robust to the choice of backbone LLM, such as gpt-3.5-turbo and gpt-4o-mini. It also discovers an implicit, data-driven equation via velocity differentiation.
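The two reuse modes of the knowledge library can be illustrated with a small sketch. It assumes a fixed capacity $K$, oldest-first eviction, and a token-overlap (Jaccard) similarity. The class name `KnowledgeLibrary`, its methods, and both of those policies are hypothetical stand-ins chosen for illustration, not CoEvo's actual implementation, and the summarizing and clustering steps are omitted.

```python
import random
from collections import deque
from typing import List, Optional


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1] (assumed measure, for illustration only)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


class KnowledgeLibrary:
    """Bounded store of idea summaries (hypothetical interface)."""

    def __init__(self, capacity: int = 30, seed: Optional[int] = None):
        # deque(maxlen=K) silently drops the oldest idea once K is exceeded.
        # Oldest-first eviction is an assumption of this sketch.
        self.ideas = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, idea: str) -> None:
        """Store an idea unless an identical one is already present."""
        if idea not in self.ideas:
            self.ideas.append(idea)

    def random_reuse(self, n: int = 1) -> List[str]:
        """Random Reuse: draw up to n stored ideas uniformly, without replacement."""
        n = min(n, len(self.ideas))
        return self.rng.sample(list(self.ideas), n)

    def similarity_reuse(self, query: str, n: int = 1) -> List[str]:
        """Similarity-based Reuse: return the n stored ideas closest to the query."""
        ranked = sorted(self.ideas, key=lambda idea: jaccard(query, idea), reverse=True)
        return ranked[:n]


# Usage: a library of capacity 2 keeps only the two most recent ideas.
library = KnowledgeLibrary(capacity=2, seed=0)
library.add("use inverse square law")
library.add("differentiate velocity to get acceleration")
library.add("fit exponential decay term")
closest = library.similarity_reuse("velocity differentiation gives acceleration", n=1)
```

Random reuse favors exploration, while similarity-based reuse retrieves ideas related to the solution currently being refined. A real system would more plausibly compare ideas with text embeddings than with raw token overlap.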

Abstract

The discovery of symbolic solutions -- mathematical expressions, logical rules, and algorithmic structures -- is fundamental to advancing scientific and engineering progress. However, traditional methods often struggle with search efficiency and fail to integrate knowledge effectively. While recent large language model-based (LLM-based) approaches have demonstrated improvements in search efficiency, they lack the ability to continually refine and expand upon discovered solutions and their underlying knowledge, limiting their potential for open-ended innovation. To address these limitations, we introduce CoEvo, a novel framework that leverages large language models within an evolutionary search methodology to continually generate and refine symbolic solutions. CoEvo integrates a dynamic knowledge library, enabling open-ended innovation of solutions through effective knowledge management. Additionally, CoEvo leverages multiple representations of solutions -- including natural language, mathematical expressions, and code -- to further enhance search efficiency. By combining the reasoning capabilities of LLMs with the exploratory power of evolutionary algorithms, CoEvo significantly improves the efficiency and scope of symbolic discovery. Our experimental results demonstrate that this method not only enhances the efficiency of searching for symbolic solutions but also supports the ongoing discovery process, akin to human scientific endeavors. This study represents a first effort in conceptualizing the search for symbolic solutions as a lifelong, iterative process, marking a significant step towards harnessing LLMs in the perpetual pursuit of scientific and engineering breakthroughs. Our code is available at https://github.com/pgg3/CoEvo.
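To make the multi-representation idea concrete, the sketch below pairs one candidate's natural-language idea, LaTeX formula, and Python code, then scores the code with NMSE, the metric reported on the benchmarks. The `Solution` dataclass, the `equation` function-name convention, and the `evaluate` helper are hypothetical. The NMSE definition used here (mean squared error divided by the variance of the targets) is a common one and is an assumption about the exact form used in the paper.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Solution:
    """One candidate held in three representations (hypothetical structure)."""
    idea: str      # natural-language description
    formula: str   # LaTeX expression
    code: str      # Python source that defines `equation(...)`


def nmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error normalized by the variance of the targets (assumed definition)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    variance = sum((t - mean_y) ** 2 for t in y_true) / n
    return mse / variance


def evaluate(solution: Solution, inputs: Sequence[Tuple[float, ...]],
             y_true: Sequence[float]) -> float:
    """Run the code representation on each input row and return its NMSE."""
    namespace = {}
    # exec is acceptable only for trusted code; LLM output should be sandboxed.
    exec(solution.code, namespace)
    equation = namespace["equation"]
    y_pred = [equation(*row) for row in inputs]
    return nmse(y_true, y_pred)


# Usage: kinetic energy as a toy target, with (mass, speed) input rows.
candidate = Solution(
    idea="Energy grows linearly with mass and quadratically with speed.",
    formula=r"E = \frac{1}{2} m v^2",
    code="def equation(m, v):\n    return 0.5 * m * v ** 2\n",
)
inputs = [(1.0, 2.0), (2.0, 3.0), (3.0, 1.0)]
targets = [0.5 * m * v ** 2 for m, v in inputs]
score = evaluate(candidate, inputs, targets)
```

Under this definition a perfect fit scores 0 and a predictor that always returns the mean of the targets scores 1, which makes scores comparable across problems with different output scales.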

Paper Structure

This paper contains 16 sections, 10 figures, 1 table.

Figures (10)

  • Figure 1: Conceptual comparison of symbolic discovery methods across search spaces of increasing complexity and knowledge richness. Traditional approaches (pre-LLM) operate in constrained mathematical/code spaces. LLM-based methods (FunSearch, LLM-SR) leverage iterative evolution in code space, while EoH incorporates natural language heuristics. CoEvo (proposed) fully exploits LLMs' reasoning in natural language space for open-ended evolution.
  • Figure 2: Three-step LLM-driven solution generation (inspiring/thinking/solving) via an idea tree: roots evolve through evaluator-guided refinement over levels, output in multi-format representations.
  • Figure 3: An overview of CoEvo. (a) Task of interest. (b) Tree-based generation of a single solution in different formats. (c) Evolutionary search of solutions. (d) Knowledge library for storing and retrieving knowledge pieces.
  • Figure 4: Human thinking process. It is usually an iterative process of idea generation, evaluation, and refinement.
  • Figure 5: An illustration of the interaction between the knowledge library and the population. The red arrow represents the addition of knowledge to the library, while the green arrow denotes the reuse of knowledge.
  • ...and 5 more figures