AIR: Complex Instruction Generation via Automatic Iterative Refinement

Wei Liu, Yancheng He, Hui Huang, Chengwei Hu, Jiaheng Liu, Shilong Li, Wenbo Su, Bo Zheng

TL;DR

The paper tackles the difficulty of generating complex instructions for LLMs by introducing AIR, a two-stage framework that combines document-grounded instruction generation with iterative refinement guided by an LLM-as-judge. The authors use it to construct AIR-10K, a domain-diverse dataset of 10K complex instructions, and show that fine-tuning on AIR data outperforms existing instruction-generation methods on both complex and general instruction-following benchmarks. The work also analyzes document sampling, judgment strategies, data quantity, and guidance-model size, highlighting practical considerations for building robust instruction-following systems. The findings suggest that grounding instructions in real documents and refining their constraints iteratively can substantially improve alignment with real-world usage, with implications for scalable, high-quality instruction data generation.

Abstract

With the development of large language models, their ability to follow simple instructions has significantly improved. However, adhering to complex instructions remains a major challenge. Current approaches to generating complex instructions are often irrelevant to the current instruction requirements or suffer from limited scalability and diversity. Moreover, methods such as back-translation, while effective for simple instruction generation, fail to leverage the rich contents and structures in large web corpora. In this paper, we propose a novel automatic iterative refinement (AIR) framework to generate complex instructions with constraints, which not only better reflects the requirements of real scenarios but also significantly enhances LLMs' ability to follow complex instructions. The AIR framework consists of two stages: (1) generate an initial instruction from a document; (2) iteratively refine the instruction with LLM-as-judge guidance, comparing the model's output with the document to incorporate valuable constraints. Finally, we construct the AIR-10K dataset with 10K complex instructions and demonstrate that instructions generated with our approach significantly improve the model's ability to follow complex instructions, outperforming existing methods for instruction generation.
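
The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`air_refine`, `render`), the `max_rounds` stopping rule, and the toy stand-ins for the LLM calls are all assumptions made for illustration. In practice each callable would wrap a model call, and the judge would be an LLM comparing the output with the source document.

```python
from typing import Callable, List, Optional


def render(instruction: str, constraints: List[str]) -> str:
    """Combine a base instruction with its numbered constraints."""
    if not constraints:
        return instruction
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(constraints, 1))
    return f"{instruction}\nConstraints:\n{numbered}"


def air_refine(
    document: str,
    propose_instruction: Callable[[str], str],          # hypothetical LLM call
    answer: Callable[[str], str],                       # hypothetical LLM call
    judge: Callable[[str, str, str], Optional[str]],    # hypothetical LLM-as-judge
    max_rounds: int = 3,                                # assumed stopping rule
) -> str:
    # Stage 1: derive an initial instruction from the document.
    instruction = propose_instruction(document)
    constraints: List[str] = []
    # Stage 2: compare the model's output with the document and
    # fold each constraint the judge finds back into the instruction.
    for _ in range(max_rounds):
        current = render(instruction, constraints)
        output = answer(current)
        new_constraint = judge(current, output, document)
        if new_constraint is None or new_constraint in constraints:
            break
        constraints.append(new_constraint)
    return render(instruction, constraints)


# Toy stand-ins so the sketch runs without any model.
def toy_propose(document: str) -> str:
    return "Write a short product note."


def toy_answer(instruction: str) -> str:
    return "It is fast and cheap."


def toy_judge(instruction: str, output: str, document: str) -> Optional[str]:
    text = instruction.lower()
    # The document uses bullets but the output does not.
    if "- " in document and "- " not in output and "bullet" not in text:
        return "Format the answer as a bullet list."
    # The output is longer than the document.
    if len(output.split()) > len(document.split()) and "words" not in text:
        return f"Use at most {len(document.split())} words."
    return None


if __name__ == "__main__":
    print(air_refine("- fast\n- cheap", toy_propose, toy_answer, toy_judge))
```

The loop stops when the judge finds no further gap between the output and the document, or when the assumed round limit is reached.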

Paper Structure

This paper contains 30 sections, 16 figures, 8 tables, 2 algorithms.

Figures (16)

  • Figure 1: Illustration of how humans iteratively refine instructions to be more complex.
  • Figure 2: AIR: Automatic Iterative Refinement Framework.
  • Figure 3: Data statistics of AIR-10K.
  • Figure 4: Length distribution of AIR-10K.
  • Figure 5: Comparison of averaged complexity and quality scores on different datasets.
  • ...and 11 more figures