Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps

Han Wang, Yilin Zhao, Dian Li, Xiaohan Wang, Gang Liu, Xuguang Lan, Hui Wang

TL;DR

This work tackles the challenge of humor generation by large language models, which requires multi-hop reasoning and access to broad knowledge. It introduces LoL, a two-stage framework that combines supervised fine-tuning on judgment-oriented data with direct preference optimization, augmented by automatic instruction evolution (AIE) through a three-agent system and guided explorative self-improvement tuning (GESIT). External knowledge injection and structured thought processes are used to deepen humor understanding and improve generation, with rationale extraction via GPT-4o guiding preference data. Experiments on English and Chinese humor benchmarks and the Divergent Association Task (DAT) demonstrate state-of-the-art judgment and enhanced creative-generation capabilities, suggesting LoL’s potential to boost cross-domain creative applications of LLMs.
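The second stage relies on direct preference optimization. As a rough illustration only, the standard per-pair DPO loss can be sketched as below. The function name, argument names, and the default `beta` are assumptions made for this example; the summary does not state the paper's exact objective or hyperparameters.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are summed token log-probabilities of each response under the
    trainable policy and under the frozen reference model. All names and
    the default beta are illustrative assumptions, not the paper's setup.
    """
    # Implicit rewards: how much more likely each response is under the
    # policy than under the reference model.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) computed in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


if __name__ == "__main__":
    # Hypothetical log-probabilities: the policy already favors the chosen response.
    print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
```

The loss falls below log 2 once the policy prefers the chosen response more strongly than the reference model does, and rises above it in the opposite case.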

Abstract

Humor has long been regarded as a gift exclusive to humans, for the following reasons. Humor is a culturally nuanced aspect of human language, which makes it challenging to understand and generate. Humor generation requires a multi-hop reasoning process, with each hop founded on proper rationales. Although many studies, such as those related to GPT-o1, focus on logical reasoning with reflection and correction, they still fall short in humor generation. Because the knowledge graph involved in creative thinking is sparse, multi-hop reasoning is difficult to achieve. In this paper, we therefore propose a more robust framework for the humor reasoning task, named LoL. LoL injects external information to mitigate the sparsity of the knowledge graph, thereby enabling multi-hop reasoning. In the first stage of LoL, we put forward an automatic instruction-evolution method to incorporate the deeper and broader thinking processes underlying humor. Judgment-oriented instructions are devised to enhance the model's judgment capability, dynamically supplementing and updating the sparse knowledge graph. Subsequently, through reinforcement learning, the reasoning logic for each online-generated response is extracted using GPT-4o. In this process, external knowledge is re-introduced to aid the model in logical reasoning and in learning human preferences. Finally, experimental results indicate that combining these two processes enhances both the model's judgment ability and its generative capacity. These findings deepen our understanding of the creative capabilities of large language models (LLMs) and offer approaches to boost LLMs' creative abilities for cross-domain innovative applications.

Paper Structure

This paper contains 18 sections, 18 figures, 7 tables, 3 algorithms.

Figures (18)

  • Figure 1: English comparison showcase (more showcases are in the Appendix). Compared to GPT-4o and CLoT, LoL provides shorter and more conversational answers to questions. For instance, in Case 2, while LoL and CLoT may convey the same meaning, their different expressions produce different effects. Brief responses leave room for readers to ponder, enhancing interest and interactivity.
  • Figure 2: Details of the judgment-oriented instruction templates.
  • Figure 3: Details of AIE. The detailed process is shown in the paper's AIE algorithm.
  • Figure 4: Pipeline of the training process.
  • Figure 5: Divergent Association Task (DAT) validation. (a) DAT scores of LoL and three baselines. (b) DAT scores of different components of LoL. (c) t-SNE results of word vectors obtained by Qwen1.5-32B associating five target words. (d) t-SNE results of word vectors obtained by LoL associating five target words.
  • ...and 13 more figures
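
The DAT scores referenced in Figure 5 measure how semantically far apart a set of associated words are. A minimal sketch of the usual scoring rule (mean pairwise cosine distance between word vectors, scaled by 100) is given below. The toy vectors and function names are hypothetical; the embedding model and word-selection protocol used in the paper are not stated in this summary.

```python
import itertools
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_score(word_vectors):
    """Mean pairwise cosine distance between word vectors, scaled by 100."""
    pairs = list(itertools.combinations(word_vectors, 2))
    if not pairs:
        raise ValueError("need at least two word vectors")
    return 100.0 * sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical 3-dimensional "embeddings" standing in for real word vectors.
    toy_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(dat_score(toy_vectors))
```

Higher scores indicate that the associated words are more dispersed in embedding space, which is the sense in which the task serves as a proxy for divergent thinking.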