SeDi-Instruct: Enhancing Alignment of Language Models through Self-Directed Instruction Generation

Jungwoo Kim, Minsang Kim, Sungjin Lee

TL;DR

SeDi-Instruct addresses data scarcity and quality in instruction tuning by combining diversity-based filtering with iterative feedback task generation, producing high-quality seed instructions at reduced cost. The method relaxes similarity-based filtering while preserving batch diversity, and it feeds training signals back into generation to refine the seeds iteratively, so the data stays useful while API usage drops. Empirical results show a 5.2% accuracy improvement and a 36% reduction in data-generation costs compared with Self-Instruct, with performance close to that of Llama-3-8B-Instruct, the ideal reference, on several benchmarks. The work highlights practical benefits for industry-scale instruction tuning, though it notes safety considerations and a dependence on the capacity of the generation model.

Abstract

The rapid evolution of Large Language Models (LLMs) has enabled the industry to develop various AI-based services. Instruction tuning is considered essential for adapting foundation models to target domains so that they can provide high-quality services to customers. A key challenge in instruction tuning is obtaining high-quality instruction data. Self-Instruct, which automatically generates instruction data using ChatGPT APIs, alleviates the data scarcity problem. To improve the quality of instruction data, however, Self-Instruct discards many of the instructions generated by ChatGPT, which is cost-inefficient because many API calls are wasted. To generate high-quality instruction data at a low cost, we propose a novel data generation framework, Self-Direct Instruction generation (SeDi-Instruct), which employs diversity-based filtering and iterative feedback task generation. By enhancing the diversity of instructions in a batch, diversity-based filtering maintains model accuracy without excessively discarding low-quality generated instructions. This reduces the cost of synthesizing instruction data. Iterative feedback task generation integrates the instruction generation and training tasks and uses information obtained during training to create high-quality instruction sets. Our results show that SeDi-Instruct enhances the accuracy of AI models by 5.2% compared with traditional methods, while reducing data generation costs by 36%.
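
To make the filtering trade-off concrete, the following is a minimal sketch of similarity-based filtering under a strict and a relaxed threshold. It is not the paper's implementation: the token-set Jaccard metric, the function names `jaccard` and `filter_by_similarity`, the thresholds 0.5 and 0.8, and the toy instructions are all assumptions chosen for illustration. It only shows why a looser filter discards fewer generated instructions, and therefore wastes fewer API outputs.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity (an assumed stand-in similarity metric)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def filter_by_similarity(candidates: List[str], pool: List[str],
                         threshold: float) -> List[str]:
    """Keep a candidate only if its similarity to every instruction already
    in the pool, and to every candidate kept so far, is below `threshold`."""
    kept: List[str] = []
    for cand in candidates:
        if all(jaccard(cand, ref) < threshold for ref in pool + kept):
            kept.append(cand)
    return kept


# Hypothetical toy data: one existing instruction and three generated ones.
pool = ["Translate the sentence into French."]
candidates = [
    "Translate the sentence into German.",       # near-duplicate of the pool entry
    "Summarize the paragraph in one sentence.",  # clearly different
    "Translate the sentence into French.",       # exact duplicate
]

strict = filter_by_similarity(candidates, pool, threshold=0.5)   # illustrative value
relaxed = filter_by_similarity(candidates, pool, threshold=0.8)  # illustrative value

for name, kept in (("strict", strict), ("relaxed", relaxed)):
    discarded = len(candidates) - len(kept)
    print(f"{name}: kept {len(kept)}, discarded {discarded} of {len(candidates)}")
```

In this toy setting the strict filter keeps one of three candidates and the relaxed filter keeps two; the exact duplicate is rejected either way. The sketch covers only the relaxation step. How SeDi-Instruct then composes diverse batches so that the retained near-duplicates do not hurt accuracy is described in the paper and is not reproduced here.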

Paper Structure

This paper contains 21 sections, 12 figures, 5 tables, and 1 algorithm.

Figures (12)

  • Figure 1: Filtering inefficiency problem
  • Figure 2: Goal of SeDi-Instruct
  • Figure 3: Overall organization and operations of Self-Instruct
  • Figure 4: Overall organization and operations of SeDi-Instruct
  • Figure 5: Identification of attractive batches and instructions
  • ...and 7 more figures