Nature-Inspired Population-Based Evolution of Large Language Models

Yiqun Zhang, Peng Ye, Xiaocui Yang, Shi Feng, Shufei Zhang, Lei Bai, Wanli Ouyang, Shuyue Hu

TL;DR

This work defines population-based evolution for large language models by treating model weights as genes and task performance as fitness. It introduces GENOME and GENOME+, gradient-free frameworks that apply the genetic-algorithm operations of crossover, mutation, and selection; GENOME+ adds Succession and Ensemble to exploit collective knowledge and improve robustness. Across 12 datasets, GENOME(+) consistently outperforms seven baselines, achieving accuracy gains of up to $54.80\%$ over the best model in the initial population on challenging tasks like DROP, and it generalizes zero-shot to unseen tasks and scales to populations of up to 40 models. The approach runs on modest hardware (a single $24$GB GPU) and is fully open-sourced, offering a practical, scalable way to adapt LLMs rapidly without backpropagation.

Abstract

Evolution, the engine behind the survival and growth of life on Earth, operates through the population-based process of reproduction. Inspired by this principle, this paper formally defines a newly emerging problem -- the population-based evolution of large language models (LLMs) -- and introduces a novel framework. Starting with a population of parent LLMs, our framework enables the population to evolve through four key operations: (i) crossover, merging the weights of different parents to create offspring LLMs, (ii) mutation, introducing small, random changes to model weights to foster diversity, (iii) selection, prioritizing high-performing models, and (iv) succession, transferring the learned experience from parent to offspring LLMs. With only 200 samples per new task, the LLM population evolves rapidly to adapt to the task at hand, without any gradients. Experiments on 12 datasets show that our framework consistently outperforms existing multi-LLM merging and adaptation methods, achieving accuracy gains of up to 54.8% over the best LLM in the initial population. Moreover, our framework allows for the evolution of LLMs across multiple new tasks simultaneously, scales effectively with populations of up to 40 LLMs, and even supports zero-shot generalization to unseen held-out tasks. We have open-sourced the code on GitHub and released the weights of 10 parent LLMs, fine-tuned from gemma-2-2b-it, on HuggingFace, enabling reproduction of our proposed framework using just a single 4090 GPU with 24GB memory, without any performance degradation.
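
To make the crossover, mutation, and selection operations described above concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: weight vectors are short lists of floats rather than LLM parameters, crossover is assumed to be linear interpolation, mutation is assumed to be sparse Gaussian noise, and the fitness function (`make_toy_fitness`) is a hypothetical stand-in for accuracy on the 200 task samples. Succession is omitted because the abstract does not specify how experience is transferred. All function names and hyperparameters are illustrative assumptions.

```python
import random


def crossover(parent_a, parent_b, alpha):
    """Merge two weight vectors by linear interpolation (an assumed merge rule)."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(parent_a, parent_b)]


def mutate(weights, rng, rate=0.1, sigma=0.05):
    """Add small Gaussian noise to a random subset of the weights."""
    return [w + rng.gauss(0.0, sigma) if rng.random() < rate else w for w in weights]


def select(population, fitness, k):
    """Keep the k individuals with the highest fitness."""
    return sorted(population, key=fitness, reverse=True)[:k]


def evolve(population, fitness, generations=30, seed=0):
    """Gradient-free loop: crossover, then mutation, then selection."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        offspring = []
        for _ in range(size):
            a, b = rng.sample(population, 2)  # pick two distinct parents
            offspring.append(mutate(crossover(a, b, rng.random()), rng))
        # Selecting from parents plus offspring keeps the best individual so far,
        # so the best fitness in the population never decreases.
        population = select(population + offspring, fitness, size)
    return population


def make_toy_fitness(target):
    """Hypothetical fitness: negative squared distance to a target vector."""
    def fitness(weights):
        return -sum((w - t) ** 2 for w, t in zip(weights, target))
    return fitness


if __name__ == "__main__":
    rng = random.Random(1)
    target = [0.5, -0.2, 0.8, 0.1]
    parents = [[rng.uniform(-1.0, 1.0) for _ in target] for _ in range(10)]
    fitness = make_toy_fitness(target)
    evolved = evolve(parents, fitness)
    print("best parent fitness: ", max(map(fitness, parents)))
    print("best evolved fitness:", max(map(fitness, evolved)))
```

In a real setting, each "individual" would be a full set of model weights and `fitness` would evaluate the model on held-out task samples; only forward passes are needed, which is what makes such a loop gradient-free.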

Paper Structure

This paper contains 35 sections, 6 equations, 14 figures, 11 tables, 2 algorithms.

Figures (14)

  • Figure 1: GENOME+: a population-based evolutionary framework, including crossover, mutation, succession, selection and ensemble operations, for LLMs.
  • Figure 2: Performance trends with increasing population sizes ($N$) across different methods.
  • Figure 3: Capability distribution of expert models across seven dimensions, highlighting their specialized strengths.
  • Figure 4: The prompt of MMLU.
  • Figure 5: The prompt of MMLUPro.
  • ...and 9 more figures