EvoMerge: Neuroevolution for Large Language Models
Yushu Jiang
TL;DR
The paper addresses a limitation of standard fine-tuning for large language models by proposing EvoMerge, a neuroevolution framework that uses model merging for weight crossover and fine-tuning for weight mutation to explore diverse training trajectories. It outlines a six-step evolutionary loop and details a prototype workflow (initialization, evaluation, selection, crossover, mutation) with SLERP-based merging and DPO-inspired tuning, implemented in small-scale experiments. The reported experiments assess EvoMerge on benchmarks such as HellaSwag, WinoGrande, ARC, MMLU, TruthfulQA, and GSM8K. They illustrate the approach's feasibility and give exploratory results, while highlighting instability and the need for further design work. The work aims to provide a framework that can uncover robust, generalizable improvements beyond conventional fine-tuning by leveraging diversity and the recombination of training signals.
Abstract
Extensive fine-tuning of large language models does not always yield better results. Models often become better at imitating one form of data without gaining reasoning ability, and may even lose some of their intelligence. Here I introduce EvoMerge, a systematic approach to large language model training and merging. Leveraging model merging for weight crossover and fine-tuning for weight mutation, EvoMerge establishes an evolutionary process aimed at pushing models beyond the limits of conventional fine-tuning.
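
The crossover-and-mutation idea can be illustrated with a minimal toy sketch; it is not the paper's implementation. It assumes each "model" is a flat list of floats, uses SLERP (the merging method named in the TL;DR) for crossover, and replaces fine-tuning with small Gaussian noise as a stand-in mutation. The names slerp, mutate, and evolve, the population handling, and the fitness function are all hypothetical choices made for illustration.

```python
import math
import random


def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    linear = [(1 - t) * x + t * y for x, y in zip(a, b)]
    if norm_a < eps or norm_b < eps:
        return linear  # angle undefined for a zero vector: fall back to lerp
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return linear  # (anti-)parallel vectors: fall back to lerp
    wa = math.sin((1 - t) * omega) / sin_omega
    wb = math.sin(t * omega) / sin_omega
    return [wa * x + wb * y for x, y in zip(a, b)]


def mutate(weights, rng, scale=0.05):
    """ASSUMPTION: Gaussian noise stands in for a fine-tuning step."""
    return [w + rng.gauss(0.0, scale) for w in weights]


def evolve(population, fitness, generations, rng):
    """Toy loop: evaluate, select the top half, then crossover and mutate."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)   # evaluation
        parents = ranked[: max(2, len(ranked) // 2)]             # selection
        children = []
        while len(parents) + len(children) < len(population):
            p1, p2 = rng.sample(parents, 2)
            child = slerp(p1, p2, rng.random())                  # crossover
            children.append(mutate(child, rng))                  # mutation
        population = parents + children  # parents survive unchanged
    return max(population, key=fitness)


# Hypothetical usage: fitness is the negative squared distance to a target,
# standing in for a benchmark score.
target = [0.5, -0.2, 0.8]
toy_fitness = lambda w: -sum((x - y) ** 2 for x, y in zip(w, target))
rng = random.Random(0)
initial = [[rng.uniform(-1, 1) for _ in target] for _ in range(6)]
best = evolve(initial, toy_fitness, generations=20, rng=rng)
```

Because the selected parents are carried over unchanged, the best score in this sketch never decreases from one generation to the next. A real setup would replace the toy fitness with benchmark evaluation and the noise with actual fine-tuning, both of which are far more expensive and, as the TL;DR notes, can be unstable.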
