A Framework for Fine-Tuning LLMs using Heterogeneous Feedback

Ryan Aponte; Ryan A. Rossi; Shunan Guo; Franck Dernoncourt; Tong Yu; Xiang Chen; Subrata Mitra; Nedim Lipka

TL;DR

The paper tackles the limitation of relying on a single supervision type for fine-tuning LLMs by introducing a heterogeneous-feedback framework that unifies diverse signals into a common training set $D_{train}$ through Simple Unionization and quality-diversity filtering. It then applies standard SFT and RLHF pipelines, augmented with LoRA, to large models (e.g., LLaMA-7B) across multi-task datasets (WinoGrande, OASST, WinoGender), demonstrating reductions in bias while preserving instruction-following performance. Key findings show that heterogeneous supervision can outperform single-dataset fine-tuning, and that selective filtering by quality and diversity can match or exceed the full-data baseline, enabling efficient multi-objective alignment. The approach has practical impact for scalable, multi-task fine-tuning of LLMs, broadening the use of heterogeneous human feedback in real-world alignment pipelines.

Abstract

Large language models (LLMs) have been applied to a wide range of tasks, including text summarization, web navigation, and chatbots. They have benefited from supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) following unsupervised pretraining. The datasets these methods require can be difficult to collect, limited in scope, and variable in sample quality. Additionally, datasets can vary extensively in supervision format, from numerical to binary as well as multi-dimensional with many different values. We present a framework for fine-tuning LLMs using heterogeneous feedback, which has two main components. First, we combine the heterogeneous feedback data into a single supervision format, compatible with methods like SFT and RLHF. Next, given this unified feedback dataset, we extract a high-quality and diverse subset, which can yield performance exceeding that obtained with the full dataset. We conduct extensive experiments to understand the effectiveness of these techniques for incorporating heterogeneous feedback, and demonstrate improvements from using a high-quality and diverse subset of the data. We find that our framework is able to improve models in multiple areas simultaneously, such as instruction following and bias reduction.
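
To make the first component concrete, the sketch below shows one way feedback in binary, numerical, and multi-dimensional formats could be mapped onto a single scalar score and then concatenated. It is an illustration only: the `Sample` record, the `from_*` converters, the `[0, 1]` score range, and the use of a plain mean for multi-dimensional feedback are assumptions made here and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Sample:
    """Hypothetical unified record: one prompt/response pair with a score in [0, 1]."""
    prompt: str
    response: str
    score: float
    source: str


def from_binary(prompt: str, response: str, label: bool, source: str) -> Sample:
    # Binary feedback (e.g. accepted / rejected) maps to the ends of the range.
    return Sample(prompt, response, 1.0 if label else 0.0, source)


def from_numeric(prompt: str, response: str, rating: float,
                 lo: float, hi: float, source: str) -> Sample:
    # Numerical feedback on a known scale [lo, hi] is min-max normalised.
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return Sample(prompt, response, (rating - lo) / (hi - lo), source)


def from_multidim(prompt: str, response: str,
                  ratings: Dict[str, float], source: str) -> Sample:
    # Multi-dimensional feedback, assumed already in [0, 1] per dimension,
    # is collapsed with an unweighted mean (an assumption of this sketch).
    if not ratings:
        raise ValueError("ratings must not be empty")
    return Sample(prompt, response, sum(ratings.values()) / len(ratings), source)


def unionize(*datasets: Iterable[Sample]) -> List[Sample]:
    # Concatenate the converted datasets into one list of unified samples.
    return [sample for dataset in datasets for sample in dataset]
```

Once every source shares one record type, the combined list can be passed to a single filtering and fine-tuning pipeline regardless of how each dataset was originally annotated.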

Paper Structure

This paper contains 23 sections, 3 equations, 1 figure, 1 table.

Introduction
Framework
Primary fine-tuning dataset
Secondary fine-tuning dataset
Simple Unionization for Feedback
Quality Selection
Diversity Selection
Training
Experimental Setup
Heterogeneous Datasets
Dataset Filtering
Baselines
Metrics
Results
Quantitative Results
...and 8 more sections

Figures (1)

Figure 1: Framework. First, we concatenate the datasets into a single dataset of heterogeneous feedback. We then score samples by quality and prompt diversity and remove a fraction of them (a hyperparameter) to form the homogeneous dataset $D_{train}$. Standard fine-tuning methods are then applied to a pre-trained LLM.
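
The scoring and removal step in the caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: samples are assumed to be `(prompt, response, score)` tuples with a precomputed quality score, prompt diversity is approximated with token-level Jaccard similarity as a stand-in for whatever diversity measure the paper uses, and `select_subset`, `keep_fraction`, and `max_similarity` are hypothetical names.

```python
import math
from typing import List, Tuple

ScoredSample = Tuple[str, str, float]  # (prompt, response, quality score)


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two prompts."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def select_subset(samples: List[ScoredSample],
                  keep_fraction: float,
                  max_similarity: float = 0.8) -> List[ScoredSample]:
    """Greedily keep high-scoring samples whose prompts differ from those already kept.

    keep_fraction is the hyperparameter controlling how much data survives;
    fewer samples may be returned if too many prompts are near-duplicates.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    budget = max(1, math.ceil(keep_fraction * len(samples)))
    # Visit samples from highest to lowest quality score.
    ranked = sorted(samples, key=lambda s: s[2], reverse=True)
    kept: List[ScoredSample] = []
    for sample in ranked:
        if len(kept) >= budget:
            break
        # Skip a sample if its prompt is too similar to any kept prompt.
        if all(jaccard(sample[0], k[0]) <= max_similarity for k in kept):
            kept.append(sample)
    return kept
```

The returned list plays the role of $D_{train}$ in this sketch. Visiting samples in descending score order means quality takes priority, while the similarity check prevents the budget from being spent on near-identical prompts.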
