Automatic Instruction Evolving for Large Language Models

Weihao Zeng, Can Xu, Yingxiu Zhao, Jian-Guang Lou, Weizhu Chen

TL;DR

Auto Evol-Instruct addresses the scalability bottleneck of instruction-tuning LLMs by automating instruction evolution. It uses a dual-LLM loop (an evol LLM and an optimizer LLM) to automatically design and refine evolving strategies, producing an evolved dataset $X_e$ that improves $Q(X_e)$, the performance of a model instruction-tuned on $X_e$. Across MT-Bench, AlpacaEval, GSM8K, and HumanEval, the auto-optimized methods outperform human-designed Evol-Instruct under comparable data budgets. This approach enables cost-efficient, cross-domain instruction tuning with reduced human supervision.

Abstract

Fine-tuning large pre-trained language models with Evol-Instruct has achieved encouraging results across a wide range of tasks. However, designing effective evolving methods for instruction evolution requires substantial human expertise. This paper proposes Auto Evol-Instruct, an end-to-end framework that evolves instruction datasets using large language models without any human effort. The framework automatically analyzes and summarizes suitable evolutionary strategies for the given instruction data and iteratively improves the evolving method based on issues exposed during the instruction evolution process. Our extensive experiments demonstrate that the best method optimized by Auto Evol-Instruct outperforms human-designed methods on various benchmarks, including MT-Bench, AlpacaEval, GSM8K, and HumanEval.

Paper Structure (34 sections, 2 equations, 14 figures, 10 tables)

Figures (14)

  • Figure 1: Overall architecture of Auto Evol-Instruct. It illustrates how the initial evolving method $e_{0}$ is optimized into the optimal evolving method $e^{*}$, detailing the transition from $e_{t-1}$ to $e_{t}$. The yellow and green parts denote Evol Trajectory Analysis and Evolving Method Optimization, respectively. $x^{(1)}$ to $x^{(l)}$ represent an example evolutionary trajectory obtained when the evol LLM, guided by $e_{t-1}$, evolves $x$ for $l$ rounds. The feedback and candidate improved evolving methods obtained from the $m$ Multiple Optimizations are denoted $f_{t}^{1}$ to $f_{t}^{m}$ and $e_{t}^{1}$ to $e_{t}^{m}$, respectively.
  • Figure 2: Initial Evolving Method. Under this method, the Evol LLM evolves the instruction. Auto Evol-Instruct will optimize this method into an optimal version for evolving the entire dataset of instructions efficiently.
  • Figure 3: Effect of the Initial Evolving Method. GPT-3.5-turbo as evol LLM, GPT-4 as optimizer LLM.
  • Figure 4: Effect of Auto Evol-Instruct on Initial Evolving Methods. GPT-3.5-turbo as evol LLM, GPT-4 as optimizer LLM. Default and Weak denote the original and the simple evolving method, respectively.
  • Figure 5: Hyperparameters for Auto Evol-Instruct. GPT-3.5-turbo as evol LLM, GPT-4 as optimizer LLM.
  • ...and 9 more figures
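
The transition from $e_{t-1}$ to $e_{t}$ described in the Figure 1 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the callables `evol`, `analyze`, `optimize`, and `score` are hypothetical stand-ins for LLM calls, and the rule for picking $e_{t}$ among the $m$ candidates (here, the highest `score`) is an assumption, since the caption does not state how the candidates are compared.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for LLM calls (names are assumptions, not the paper's API).
EvolFn = Callable[[str, str], str]      # (evolving method, instruction) -> evolved instruction
AnalyzeFn = Callable[[List[str]], str]  # trajectory -> feedback on issues found
OptimizeFn = Callable[[str, str], str]  # (evolving method, feedback) -> candidate method
ScoreFn = Callable[[str], float]        # candidate method -> score, higher is better (assumed)


def evolve_trajectory(evol: EvolFn, method: str, x: str, rounds: int) -> List[str]:
    """Evolve instruction x for `rounds` rounds under `method`.

    Returns [x, x^(1), ..., x^(l)] with l = rounds.
    """
    trajectory = [x]
    for _ in range(rounds):
        trajectory.append(evol(method, trajectory[-1]))
    return trajectory


def optimization_step(
    method: str,
    x: str,
    evol: EvolFn,
    analyze: AnalyzeFn,
    optimize: OptimizeFn,
    score: ScoreFn,
    rounds: int,
    m: int,
) -> Tuple[str, List[str]]:
    """One step e_{t-1} -> e_t: trajectory analysis, then m candidate optimizations."""
    trajectory = evolve_trajectory(evol, method, x, rounds)

    feedbacks: List[str] = []
    candidates: List[str] = []
    for _ in range(m):
        feedback = analyze(trajectory)                 # f_t^i (Evol Trajectory Analysis)
        feedbacks.append(feedback)
        candidates.append(optimize(method, feedback))  # e_t^i (Evolving Method Optimization)

    # Assumed selection rule: keep the candidate with the highest score.
    best = max(candidates, key=score)
    return best, feedbacks
```

In practice `analyze` and `optimize` would be sampled calls to the optimizer LLM, so the $m$ feedbacks and candidates would differ from one another; repeating `optimization_step` starting from $e_{0}$ would then yield the sequence of evolving methods.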