TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution

Jiuding Yang, Shengyao Lu, Weidong Guo, Xiangyang Li, Kaitong Yang, Yu Xu, Di Niu

TL;DR

Applied across multiple domains, LLMs fine-tuned with TaCIE have substantially outperformed those tuned with conventional methods, marking a significant advancement in instruction-based model fine-tuning.

Abstract

Large Language Models (LLMs) require precise alignment with complex instructions to optimize their performance in real-world applications. As the demand for refined instruction tuning data increases, traditional methods that evolve simple seed instructions often struggle to effectively enhance complexity or manage difficulty scaling across various domains. Our innovative approach, Task-Centered Instruction Evolution (TaCIE), addresses these shortcomings by redefining instruction evolution from merely evolving seed instructions to a more dynamic and comprehensive combination of elements. TaCIE starts by deconstructing complex instructions into their fundamental components. It then generates and integrates new elements with the original ones, reassembling them into more sophisticated instructions that progressively increase in difficulty, diversity, and complexity. Applied across multiple domains, LLMs fine-tuned with these evolved instructions have substantially outperformed those tuned with conventional methods, marking a significant advancement in instruction-based model fine-tuning.
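The decompose, extend, and reassemble loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: in TaCIE an LLM performs the decomposition and generates the new elements, whereas here they are supplied by hand. The `InstructionElements` class, its `background`/`objectives`/`constraints` fields, and the `evolve` and `reassemble` helpers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InstructionElements:
    """Hypothetical decomposition of an instruction (not the paper's schema)."""
    background: str
    objectives: List[str]
    constraints: List[str] = field(default_factory=list)


def evolve(elements: InstructionElements,
           new_objective: Optional[str] = None,
           new_constraint: Optional[str] = None) -> InstructionElements:
    """Return a new element set with the extra elements integrated.

    Assumption: new elements are passed in directly; a real pipeline
    would ask an LLM to propose them.
    """
    objectives = list(elements.objectives)
    constraints = list(elements.constraints)
    if new_objective:
        objectives.append(new_objective)
    if new_constraint:
        constraints.append(new_constraint)
    return InstructionElements(elements.background, objectives, constraints)


def reassemble(elements: InstructionElements) -> str:
    """Join the elements back into a single instruction string."""
    parts = [elements.background,
             "Objectives: " + "; ".join(elements.objectives)]
    if elements.constraints:
        parts.append("Constraints: " + "; ".join(elements.constraints))
    return "\n".join(parts)


# One evolution round on a hand-written seed.
seed = InstructionElements("You are given a list of integers.",
                           ["Sort the list."])
evolved = evolve(seed,
                 new_objective="Remove duplicates.",
                 new_constraint="Do not use a built-in sort.")
print(reassemble(evolved))
```

Calling `evolve` repeatedly on its own output mimics successive evolution rounds, with each round adding elements rather than rewriting the seed text directly.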

Paper Structure

This paper contains 22 sections, 12 equations, 10 figures, and 1 table.

Figures (10)

  • Figure 1: Real examples of applying Evol-Instruct using GPT-4.
  • Figure 2: An illustration of TaCIE during the $t$-th round of evolution.
  • Figure 3: Task Fusion Sampling.
  • Figure 4: The domain distribution. Note that each fused instruction contributes to multiple domains because it carries objectives from two seeds.
  • Figure 5: The distribution of evolution rounds.
  • ...and 5 more figures