AdvancedIF: Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following

Yun He, Wenzhe Li, Hejia Zhang, Songlin Li, Karishma Mandyam, Sopan Khosla, Yuanhao Xiong, Nanshu Wang, Xiaoliang Peng, Beibin Li, Shengjie Bi, Shishir G. Patil, Qi Qi, Shengyu Feng, Julian Katz-Samuels, Richard Yuanzhe Pang, Sujan Gonugondla, Hunter Lang, Yue Yu, Yundi Qian, Maryam Fazel-Zarandi, Licheng Yu, Amine Benhalloum, Hany Awadalla, Manaal Faruqui

TL;DR

This work tackles the challenge of advanced instruction following in LLMs by introducing AdvancedIF, a large, expert-curated benchmark with rubric-based evaluation for complex, multi-turn and system-prompt tasks. It then proposes RIFL, a full-stack rubric-based RL pipeline that (i) generates high-quality rubrics, (ii) trains a specialized rubric verifier, and (iii) shapes rewards to minimize hacking while maximizing instruction-following performance. Empirical results show that RIFL yields a 6.7% absolute improvement on AdvancedIF and gains on public benchmarks, with extensive ablations confirming the value of each component. The study positions rubrics as a powerful, interpretable tool for both training and evaluating higher-level instruction following in LLMs, enabling more capable and reliable AI systems.

Abstract

Recent progress in large language models (LLMs) has led to impressive performance on a range of tasks, yet advanced instruction following (IF), especially for complex, multi-turn, and system-prompted instructions, remains a significant challenge. Rigorous evaluation and effective training for such capabilities are hindered by the lack of high-quality, human-annotated benchmarks and reliable, interpretable reward signals. In this work, we introduce AdvancedIF (we will release this benchmark soon), a comprehensive benchmark featuring over 1,600 prompts and expert-curated rubrics that assess LLMs' ability to follow complex, multi-turn, and system-level instructions. We further propose RIFL (Rubric-based Instruction-Following Learning), a novel post-training pipeline that leverages rubric generation, a finetuned rubric verifier, and reward shaping to enable effective reinforcement learning for instruction following. Extensive experiments demonstrate that RIFL substantially improves the instruction-following abilities of LLMs, achieving a 6.7% absolute gain on AdvancedIF and strong results on public benchmarks. Our ablation studies confirm the effectiveness of each component in RIFL. This work establishes rubrics as a powerful tool for both training and evaluating advanced IF in LLMs, paving the way for more capable and reliable AI systems.

Paper Structure

This paper contains 21 sections, 1 equation, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Example from the AdvancedIF benchmark. The example is drawn from the multi-turn carried-context capability, where prompts and rubrics are written by human experts.
  • Figure 2: Framework of RIFL.
  • Figure 3: Example of rubric verification training data.
  • Figure 4: RL stage of rubric verifier training (see the section on the rubric verifier method). The reward is computed as the ratio of agreement between the verifier's results and the expert labels across all criteria.
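
The reward described in the Figure 4 caption can be illustrated with a minimal sketch. It assumes each rubric criterion receives a binary pass/fail judgment from the verifier and a matching binary expert label; the function name `agreement_reward` and the boolean encoding are hypothetical choices for illustration, not taken from the paper.

```python
from typing import Sequence


def agreement_reward(
    verifier_judgments: Sequence[bool],
    expert_labels: Sequence[bool],
) -> float:
    """Fraction of rubric criteria on which the verifier agrees with the expert.

    Assumption: one binary judgment per criterion, aligned by position.
    """
    if len(verifier_judgments) != len(expert_labels):
        raise ValueError("verifier judgments and expert labels must align")
    if not expert_labels:
        raise ValueError("at least one criterion is required")

    # Count criteria where the verifier's verdict matches the expert's label.
    matches = sum(v == e for v, e in zip(verifier_judgments, expert_labels))
    return matches / len(expert_labels)


# Hypothetical example: four criteria, verifier agrees with the expert on two.
reward = agreement_reward(
    [True, False, True, True],
    [True, True, True, False],
)
```

Under these assumptions the reward lies in [0, 1], reaching 1 only when the verifier matches the expert on every criterion.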