Teaching According to Talents! Instruction Tuning LLMs with Competence-Aware Curriculum Learning
Yangning Li, Tingwei Lu, Yinghui Li, Yankai Chen, Wei-Chieh Huang, Wenhao Jiang, Hui Wang, Hai-Tao Zheng, Philip S. Yu
TL;DR
CAMPUS addresses the rigidity of static curriculum tuning by introducing a dynamic, competence-aware, multi-perspective curriculum that selects sub-curricula via a perplexity-based scheduler conditioned on the LLM's current competence. It combines four difficulty metrics, including a learned scoring model with adversarial training, to adapt the curriculum as training progresses. Empirical results on GSM8K, HumanEval, and MT-Bench across multiple backbones show consistent gains over baselines and robustness when combined with other data-selection methods, with larger gains observed for bigger models. The work demonstrates practical applicability for data-efficient instruction tuning and provides a foundation for further integration with reinforcement-learning-driven data management strategies.
Abstract
Efficient instruction tuning aims to enhance the final performance of large language models (LLMs) trained on a given instruction dataset. Curriculum learning, a typical data organization strategy, has shown preliminary effectiveness in instruction tuning. However, current curriculum tuning methods suffer from curriculum rigidity: because they rely solely on static heuristic difficulty metrics, they fail to adapt to the evolving capabilities of the model during training, resulting in a fixed and potentially sub-optimal learning trajectory. To address this issue, a Competence-Aware Multi-Perspective cUrriculum inStruction tuning framework, termed CAMPUS, is proposed. CAMPUS offers several advantages: (1) dynamic selection of sub-curricula; (2) competence-aware adjustment of the curriculum schedule; (3) scheduling based on multiple difficulty metrics. Extensive experiments demonstrate the superior performance of CAMPUS compared to other state-of-the-art baselines for efficient instruction tuning.
