EvoSpeak: Large Language Models for Interpretable Genetic Programming-Evolved Heuristics

Meng Xu, Jiao Liu, Yew Soon Ong

TL;DR

The paper tackles the opacity and limited transferability of GP-evolved heuristics in dynamic optimization by introducing EvoSpeak, which uses large language models (LLMs) to extract knowledge from existing heuristics, generate warm-start populations, and translate evolved rules into human-readable explanations. The method couples an offline, LLM-driven pre-processing stage with GP optimization and uses a knowledge-augmented objective that balances multiple criteria through a weighted-sum formulation. Empirical evaluation on dynamic flexible job shop scheduling (DFJSS) shows that EvoSpeak improves initial population quality, accelerates convergence, and yields interpretable reports, while enabling cross-task knowledge transfer and preference-aware customization. By combining symbolic search with natural-language interpretability and transferability, the approach makes GP-based heuristics more practical to deploy as transparent, adaptable decision support in manufacturing and related domains.
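The weighted-sum idea mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `weighted_sum`, the two objective values per rule, and the assumption that both objectives are minimized are hypothetical choices made for illustration. Only the preference vector (0.8, 0.2) appears in the source, in a figure caption.

```python
from typing import Sequence


def weighted_sum(objectives: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse several objective values into one scalar fitness.

    Assumption: every objective is minimized, so a lower score is better.
    """
    if len(objectives) != len(weights):
        raise ValueError("objectives and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * f for w, f in zip(weights, objectives))


# Preference vector taken from the figure caption; its meaning here
# (weight on objective 1, weight on objective 2) is an assumption.
preference = (0.8, 0.2)

# Hypothetical objective values of two candidate heuristics.
rule_a = (10.0, 50.0)
rule_b = (20.0, 5.0)

score_a = weighted_sum(rule_a, preference)
score_b = weighted_sum(rule_b, preference)
best = "rule_a" if score_a < score_b else "rule_b"
```

Under these made-up numbers the scalar scores differ only slightly, which shows how the preference vector, rather than either objective alone, decides which heuristic is ranked first.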

Abstract

Genetic programming (GP) has demonstrated strong effectiveness in evolving tree-structured heuristics for complex optimization problems. Yet, in dynamic and large-scale scenarios, the most effective heuristics are often highly complex, hindering interpretability, slowing convergence, and limiting transferability across tasks. To address these challenges, we present EvoSpeak, a novel framework that integrates GP with large language models (LLMs) to enhance the efficiency, transparency, and adaptability of heuristic evolution. EvoSpeak learns from high-quality GP heuristics, extracts knowledge, and leverages this knowledge to (i) generate warm-start populations that accelerate convergence, (ii) translate opaque GP trees into concise natural-language explanations that foster interpretability and trust, and (iii) enable knowledge transfer and preference-aware heuristic generation across related tasks. We verify the effectiveness of EvoSpeak through extensive experiments on dynamic flexible job shop scheduling (DFJSS), under both single- and multi-objective formulations. The results demonstrate that EvoSpeak produces more effective heuristics, improves evolutionary efficiency, and delivers human-readable reports that enhance usability. By coupling the symbolic reasoning power of GP with the interpretative and generative strengths of LLMs, EvoSpeak advances the development of intelligent, transparent, and user-aligned heuristics for real-world optimization problems.

Paper Structure

This paper contains 28 sections, 8 equations, 8 figures, 6 tables, 1 algorithm.

Figures (8)

  • Figure 1: An example of a tree-structured routing rule evolved for a scheduling problem. Yellow nodes denote system features (e.g., WIQ, MWT), while blue nodes denote operators (e.g., max, min, subtraction). While effective, such trees can be large and difficult to interpret.
  • Figure 2: Overall EvoSpeak framework. LLMs act as both a knowledge extraction engine and a symbolic interpreter, integrated into the GP loop.
  • Figure 3: An example prompt for population initialization using an LLM, incorporating existing scheduling heuristics and user preferences for a multi-objective DFJSS problem.
  • Figure 4: Fitness distribution of the initial populations generated by GP and EvoSpeak.
  • Figure 5: Convergence curves of test performance over 30 independent runs of GP and EvoSpeak across 4 scenarios under preference (0.8, 0.2).
  • ...and 3 more figures
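
The tree-structured rule described in the Figure 1 caption can be sketched as a small expression tree. This is an illustrative sketch only: the class names, the specific rule `max(WIQ, MWT - WIQ)`, the feature values, and the convention that the machine with the lowest priority value is chosen are all assumptions, not details taken from the paper. Only the feature names (WIQ, MWT) and the operators (max, min, subtraction) come from the caption.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Terminal:
    """Leaf node holding the name of a system feature, e.g. WIQ or MWT."""
    name: str


@dataclass(frozen=True)
class Function:
    """Internal node applying an operator to its child subtrees."""
    op: str
    children: Tuple["Node", ...]


Node = Union[Terminal, Function]

# Operators named in the Figure 1 caption.
OPERATORS = {"max": max, "min": min, "sub": lambda a, b: a - b}


def evaluate(node: Node, features: Dict[str, float]) -> float:
    """Recursively compute the priority value of a rule for one machine."""
    if isinstance(node, Terminal):
        return features[node.name]
    args = [evaluate(child, features) for child in node.children]
    return OPERATORS[node.op](*args)


def to_infix(node: Node) -> str:
    """Render the tree as a readable one-line expression."""
    if isinstance(node, Terminal):
        return node.name
    parts = [to_infix(child) for child in node.children]
    if node.op == "sub":
        return f"({parts[0]} - {parts[1]})"
    return f"{node.op}({', '.join(parts)})"


# Hypothetical routing rule: max(WIQ, MWT - WIQ).
rule = Function("max", (
    Terminal("WIQ"),
    Function("sub", (Terminal("MWT"), Terminal("WIQ"))),
))

# Made-up feature values for two candidate machines.
machines = {
    "M1": {"WIQ": 12.0, "MWT": 3.0},
    "M2": {"WIQ": 4.0, "MWT": 9.0},
}

priorities = {m: evaluate(rule, f) for m, f in machines.items()}
# Assumption: the machine with the lowest priority value is selected.
chosen = min(priorities, key=priorities.get)
expression = to_infix(rule)
```

Even this three-node rule needs a moment to read in infix form; evolved trees with dozens of nodes are correspondingly harder to interpret, which is the motivation the caption gives for translating rules into natural language.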