Self-training Large Language Models through Knowledge Detection

Wei Jie Yeo; Teddy Ferdinan; Przemyslaw Kazienko; Ranjan Satapathy; Erik Cambria

TL;DR

This paper explores a self-training paradigm in which the LLM autonomously curates its own labels and selectively trains on unknown data samples identified through a reference-free consistency method. The results suggest that such an approach can substantially reduce the dependency on large labeled datasets, paving the way for more scalable and cost-effective language model training.

Abstract

Large language models (LLMs) often necessitate extensive labeled datasets and training compute to achieve impressive performance across downstream tasks. This paper explores a self-training paradigm, where the LLM autonomously curates its own labels and selectively trains on unknown data samples identified through a reference-free consistency method. Empirical evaluations show significantly reduced hallucination in generation across multiple subjects. Furthermore, the selective training framework mitigates catastrophic forgetting on out-of-distribution benchmarks, addressing a critical limitation in training LLMs. Our findings suggest that such an approach can substantially reduce the dependency on large labeled datasets, paving the way for more scalable and cost-effective language model training.

Paper Structure (22 sections, 7 equations, 5 figures, 6 tables, 1 algorithm)

Introduction
Related work
Self-training
Instruction generation
SFT stage
Preference Labeling
Knowledge Filtering
Experiments
Dataset
Model
Experiment details
Results
Impact of Self-training on Truthfulness
Catastrophic Forgetting
Varying Filtering Threshold
...and 7 more sections

Figures (5)

Figure 1: An overview of the self-training framework: instruction generation (1), SFT stage (2), preference labeling (3) and knowledge filtering (4). The four steps are implemented in sequence and the final model is assessed for truthfulness.
Figure 2: Win-Tie-Lose on main held-out questions based on Wikipedia documents. The left panel shows TinyLlama-1.1B, the middle Llama2-7B and the right the 13B model. Scores are based on pairwise comparison using GPT-4 as the evaluator, and all approaches are compared against the respective SFT model.
Figure 3: Percentage of losing rate on 200 randomly sampled instances classified as known. All approaches are compared against $\pi_{SFT}$.
Figure 4: Effects of varying $\tau_K$ on the win rate. Dashed lines show the results without performing knowledge filtering for each model.
Figure 5: Impact of varying $K$ to approximate the average contradiction score. The value of $K$ affects the number of responses used to compute both $S_L$ and $S_K$.
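
The captions for Figures 4 and 5 refer to a threshold $\tau_K$ and to $K$ responses used to approximate an average contradiction score. A minimal sketch of how such a filter could look is given below. It is an illustration under stated assumptions, not the authors' implementation: the token-overlap scorer is a toy stand-in for a real contradiction model, the record layout and the function names are hypothetical, and the rule that a higher average contradiction marks an instruction as unknown is assumed rather than taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical record layout: (instruction, reference_response, K sampled responses).
Record = Tuple[str, str, Sequence[str]]
Scorer = Callable[[str, str], float]


def toy_contradiction(a: str, b: str) -> float:
    """Stand-in scorer in [0, 1]: 1 minus the Jaccard overlap of lowercase tokens.

    Assumption: a real setup would use a learned contradiction scorer instead.
    """
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return 1.0 - len(tokens_a & tokens_b) / len(union)


def average_contradiction(
    reference: str, samples: Sequence[str], scorer: Scorer = toy_contradiction
) -> float:
    """Mean contradiction between one reference response and K sampled responses."""
    if not samples:
        raise ValueError("need at least one sampled response")
    return sum(scorer(reference, s) for s in samples) / len(samples)


def select_unknown(
    records: Sequence[Record], tau_k: float, scorer: Scorer = toy_contradiction
) -> List[str]:
    """Keep instructions whose average contradiction exceeds tau_k (assumed rule)."""
    return [
        instruction
        for instruction, reference, samples in records
        if average_contradiction(reference, samples, scorer) > tau_k
    ]
```

Calling `select_unknown(records, tau_k=0.5)` on a list of such records returns only the instructions whose sampled responses disagree with the reference on average. Raising `tau_k` keeps fewer instructions for training, and passing more samples per record corresponds to a larger $K$.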
