SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning

Chenyang Zhao; Xueying Jia; Vijay Viswanathan; Tongshuang Wu; Graham Neubig

TL;DR

Self-Guide introduces a self-synthesis paradigm where an LLM generates task-specific input-output data and then fine-tunes itself on this data, removing reliance on external datasets or teacher models. The method operates in a few-shot setting and employs a multi-stage generation and quality-filtering pipeline, plus parameter tuning, to produce robust task-specific experts. Empirical results on Super-NaturalInstructions V2 show substantial gains over prompting and other baselines, with finetuning on self-generated data outperforming in-context learning and aligning outputs with the true task distribution. The work highlights a scalable path to data-efficient, task-specific adaptation of LLMs, while noting limitations related to language scope and broader ethical implications.

Abstract

Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts. However, prompting often leads models to make predictions with lower accuracy than finetuning a model with ample training data. On the other hand, while finetuning LLMs on task-specific data generally improves their performance, abundant annotated datasets are not available for all tasks. Previous work has explored generating task-specific data from state-of-the-art LLMs and using this data to finetune smaller models, but this approach requires access to a language model other than the one being trained, which introduces cost, scalability challenges, and legal hurdles associated with continuously relying on more powerful LLMs. In response to these challenges, we propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM, then use these input-output pairs to finetune the student LLM itself. In our empirical evaluation on the Natural Instructions V2 benchmark, we find that SELF-GUIDE improves the performance of the LLM by a substantial margin. Specifically, we report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics. This sheds light on the promise of self-synthesized data guiding LLMs towards becoming task-specific experts without any external learning signals.

Paper Structure (26 sections, 1 equation, 3 figures, 6 tables)

This paper contains 26 sections, 1 equation, 3 figures, 6 tables.

Introduction
Self-Guide
Data Generation
Input Generation
Output Generation
Quality Optimization: Temperature and Rule-based Filters
Quality Optimization: Parameter Tuning
Experimental Setup
Datasets
Base Model
Baselines
Results and Analysis
Comparing self-synthesized examples with gold few-shot examples
Finetuning outperforms in-context learning on synthetic data
Self-Guide aligns LMs with the correct label distribution in many cases
...and 11 more sections

Figures (3)

Figure 1: Self-Guide uses a model's ability to generate synthetic data as a vehicle to improve the model's ability to execute a task as specified by an instruction.
Figure 2: At the heart of Self-Guide lies an efficient and effective multi-stage generation mechanism, where the LM generates input-output pairs step by step. After the generation and filtering, the self-generated data are further used to finetune the LM itself. This figure describes the process for the generation tasks.
Figure 3: This figure describes the process for the classification tasks, where we use a slightly modified procedure for self-generating data. Put simply, we first generate pseudo-labels, then generate corresponding diverse inputs, and finally generate true labels. In the Input-Output Pairs Filter, a set of valid labels is provided so that examples with labels outside this set can be filtered out. Further details are described in the Data Generation section.
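
The classification procedure in the Figure 3 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `synthesize_classification_data`, the `generate` callable standing in for the LM, the prompt wording, and the choice to sample pseudo-labels uniformly from the label set are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical stand-in for the LM: maps a prompt string to a completion string.
Generate = Callable[[str], str]


def synthesize_classification_data(
    generate: Generate,
    instruction: str,
    label_set: List[str],
    n_examples: int,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Sketch of pseudo-label -> input -> true label -> label-set filter."""
    rng = random.Random(seed)
    pairs: List[Tuple[str, str]] = []
    for _ in range(n_examples):
        # Step 1 (assumption): draw a pseudo-label at random to steer
        # input generation towards covering every class.
        pseudo_label = rng.choice(label_set)

        # Step 2: ask the model for an input conditioned on the pseudo-label.
        # The prompt wording is illustrative only.
        text = generate(
            f"{instruction}\nWrite an input whose answer is: {pseudo_label}\nInput:"
        ).strip()

        # Step 3: ask the same model for the true label of that input.
        # The pseudo-label is discarded at this point.
        label = generate(f"{instruction}\nInput: {text}\nOutput:").strip()

        # Filter: keep only non-empty inputs whose label is in the given set.
        if text and label in label_set:
            pairs.append((text, label))
    return pairs
```

Passing a real model wrapper as `generate` would yield the input-output pairs that are then used for finetuning; any function from prompt to completion can be substituted for testing.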
