LLMR: Knowledge Distillation with a Large Language Model-Induced Reward

Dongheng Li; Yongchang Hao; Lili Mou

TL;DR

LLMR is a knowledge distillation method based on a reward function induced from large language models. Empirical results show that it consistently outperforms traditional KD methods across different tasks and datasets.

Abstract

Large language models have become increasingly popular and have demonstrated remarkable performance in various natural language processing (NLP) tasks. However, these models are typically computationally expensive and difficult to deploy in resource-constrained environments. In this paper, we propose LLMR, a novel knowledge distillation (KD) method based on a reward function induced from large language models. We conduct experiments on multiple datasets for the dialogue generation and summarization tasks. Empirical results demonstrate that our LLMR approach consistently outperforms traditional KD methods across different tasks and datasets.

Paper Structure (7 sections, 10 equations, 2 figures, 2 tables)

Introduction
Related Work
Approach
Experiments
Conclusion
Acknowledgments
Bibliographical References

Figures (2)

Figure 1: Overview of the approach.
Figure 2: The averaged excess error (ExError) with respect to sequence length of different models on DailyDialog.
