MAIN: Mutual Alignment Is Necessary for instruction tuning
Fanyi Yang, Jianfeng Liu, Xin Zhang, Haoyu Liu, Xixin Cao, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Qi Zhang
TL;DR
This work identifies instruction-response alignment as a critical driver of instruction-tuning quality and introduces the Mutual Alignment Framework (MAIN), which jointly optimizes instruction and response generation in a bidirectional, EM-inspired manner. MAIN starts from seed data plus a large pool of unlabeled responses: forward and reverse models guide data synthesis, dynamic weighting balances synthetic and seed inputs, and mutual filtering curates high-quality pairs. Across LLaMA-2-7B, Mistral, and Qwen, MAIN achieves state-of-the-art results on benchmarks for output preference, instruction following, and reasoning, and it is robust across architectures and multilingual settings. The results highlight alignment as a key lever for generalizable instruction tuning, and the framework offers a scalable pipeline for generating high-quality instruction-response data.
Abstract
Instruction tuning has empowered large language models (LLMs) to achieve remarkable performance, yet its success heavily depends on the availability of large-scale, high-quality instruction-response pairs. To meet this demand, various methods have been developed to synthesize data at scale. However, current methods for scaling up data generation often overlook a crucial aspect: the alignment between instructions and responses. We hypothesize that the quality of instruction-response pairs is determined not by the individual quality of each component, but by the degree of mutual alignment. To address this, we propose a Mutual Alignment Framework (MAIN) which enforces coherence between instructions and responses through mutual constraints. We demonstrate that MAIN generalizes well across model architectures and sizes, achieving state-of-the-art performance on LLaMA, Mistral, and Qwen models across diverse benchmarks. This work underscores the critical role of instruction-response alignment in enabling generalizable and high-quality instruction tuning for LLMs. All code is available from our repository.
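As a rough illustration of the mutual filtering idea summarized above, the sketch below keeps an instruction-response pair only when it scores well in both directions. It is not the authors' implementation: the names `overlap_score` and `mutual_filter`, the token-overlap scorer, and the threshold value are all assumptions made for illustration. In a real pipeline the two scorers would presumably be a forward model (instruction to response) and a reverse model (response to instruction).

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (instruction, response)
Scorer = Callable[[str, str], float]  # (source, target) -> score in [0, 1]


def overlap_score(source: str, target: str) -> float:
    """Toy stand-in for a model score (hypothetical, not from the paper).

    Returns the fraction of target tokens that also appear in source.
    """
    source_tokens = set(source.lower().split())
    target_tokens = target.lower().split()
    if not target_tokens:
        return 0.0
    hits = sum(1 for tok in target_tokens if tok in source_tokens)
    return hits / len(target_tokens)


def mutual_filter(
    pairs: List[Pair],
    forward_score: Scorer,
    reverse_score: Scorer,
    threshold: float,
) -> List[Pair]:
    """Keep pairs whose forward AND reverse scores both reach the threshold.

    forward_score(instruction, response) approximates how well the response
    follows from the instruction; reverse_score(response, instruction)
    approximates how well the instruction can be recovered from the response.
    """
    kept = []
    for instruction, response in pairs:
        fwd = forward_score(instruction, response)
        rev = reverse_score(response, instruction)
        # A pair survives only if it is plausible in both directions.
        if min(fwd, rev) >= threshold:
            kept.append((instruction, response))
    return kept


if __name__ == "__main__":
    candidates = [
        ("list three primary colors", "three primary colors are red blue yellow"),
        ("list three primary colors", "the stock market closed higher today"),
    ]
    # Assumed threshold; a real system would tune this on held-out data.
    print(mutual_filter(candidates, overlap_score, overlap_score, threshold=0.3))
```

Taking the minimum of the two scores is one simple way to express a mutual constraint: a pair that looks good in only one direction is discarded. Other combinations, such as a weighted sum, would be equally plausible under the description given here.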
