
Small But Funny: A Feedback-Driven Approach to Humor Distillation

Sahithya Ravi, Patrick Huber, Akshat Shrivastava, Aditya Sagar, Ahmed Aly, Vered Shwartz, Arash Einolghozati

TL;DR

This work tackles the challenge of distilling creative abilities, specifically humor, from large language models (LLMs) into small language models (SLMs) by introducing a dual-role framework in which the teacher also acts as a critic. The method combines an imitation phase, in which the SLM learns from multiple humorous paraphrases generated by the LLM, with a critique phase that uses pairwise LLM-based feedback (via Multiple Choice Prompting) to steer improvements through BRIO and DPO-style objectives. Empirical results show that feedback-guided students narrow the gap to their teacher, reaching up to $65\%$ of teacher performance and surpassing supervised fine-tuning by about $18$–$20\%$, while exhibiting substantial alignment with human judgments (up to $76\%$ for humor). The findings suggest that data-efficient, feedback-driven distillation is a viable approach for deploying humorous generation in latency- and resource-constrained settings, while also highlighting biases in LLM-based evaluators and the need for robust bias mitigation.
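To make the critique phase more concrete, the following is a minimal sketch of a DPO-style loss for a single preference pair, where the "chosen" output is the one the LLM critic judged funnier. It assumes the standard DPO formulation and is not necessarily the exact variant used in the paper. The function name, argument names, and the default value of `beta` are hypothetical and chosen for illustration only.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO-style loss for one (chosen, rejected) pair of student outputs.

    All inputs are sequence log-probabilities. The `policy_*` values come
    from the student being trained; the `ref_*` values come from a frozen
    reference model (for example, the student after the imitation phase).
    `beta` is an illustrative value, not one taken from the paper.
    """
    # How much more the policy favors the chosen output than the reference does,
    # relative to the same quantity for the rejected output.
    margin = ((policy_logp_chosen - ref_logp_chosen)
              - (policy_logp_rejected - ref_logp_rejected))
    z = beta * margin
    # -log(sigmoid(z)) computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# The loss falls below log(2) when the student prefers the critic's choice
# more strongly than the reference model does, and rises above it otherwise.
loss_aligned = dpo_loss(-1.0, -3.0, -2.0, -2.0)
loss_misaligned = dpo_loss(-3.0, -1.0, -2.0, -2.0)
```

In this sketch the pairwise feedback enters only through which output is labeled chosen and which rejected, so the critic needs to provide a preference between two outputs rather than an absolute humor score.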

Abstract

The emergence of Large Language Models (LLMs) has brought to light promising language generation capabilities, particularly in performing tasks like complex reasoning and creative writing. Consequently, distillation through imitation of teacher responses has emerged as a popular technique to transfer knowledge from LLMs to more accessible, Small Language Models (SLMs). While this works well for simpler tasks, there is a substantial performance gap on tasks requiring intricate language comprehension and creativity, such as humor generation. We hypothesize that this gap may stem from the fact that creative tasks might be hard to learn by imitation alone and explore whether an approach, involving supplementary guidance from the teacher, could yield higher performance. To address this, we study the effect of assigning a dual role to the LLM - as a "teacher" generating data, as well as a "critic" evaluating the student's performance. Our experiments on humor generation reveal that the incorporation of feedback significantly narrows the performance gap between SLMs and their larger counterparts compared to merely relying on imitation. As a result, our research highlights the potential of using feedback as an additional dimension to data when transferring complex language abilities via distillation.

Paper Structure

This paper contains 40 sections, 4 figures, and 6 tables.

Figures (4)

  • Figure 1: Performance gap between LLMs and SLMs: Generations from a teacher LLM (Llama2) and a student SLM (BART) finetuned on its outputs.
  • Figure 2: The proposed knowledge distillation framework: We perform task-specific distillation from a large, general language model, in two phases: an initial imitation phase, followed by a critical feedback phase which controls the quality of the generated humorous outputs from the student.
  • Figure 3: Humor Generation
  • Figure 4: Pairwise evaluation using Multiple Choice Prompting