Think Before You Speak: Explicitly Generating Implicit Commonsense Knowledge for Response Generation

Pei Zhou, Karthik Gopalakrishnan, Behnam Hedayatnia, Seokhwan Kim, Jay Pujara, Xiang Ren, Yang Liu, Dilek Hakkani-Tur

TL;DR

The paper introduces Think-Before-Speaking (TBS), a framework that explicitly generates implicit commonsense knowledge before generating a response in open-domain dialogue. Coupling a knowledge-generation step with the response generator yields more informative and contextually grounded responses, and the generated knowledge serves as an explanation of the response. The authors build weakly supervised, knowledge-aligned dialogues using ConceptNet, explore two natural-language knowledge representations, and report gains over end-to-end RG and several knowledge-augmented baselines on most automatic metrics and in human evaluation. The results indicate that externalizing implicit knowledge can improve learning efficiency, generation quality, and interpretability, while also enabling the model to produce novel, relevant knowledge. The work suggests a promising direction for more human-like grounding in conversational AI and highlights the importance of knowledge quality and structured representations in grounding decisions.

Abstract

Implicit knowledge, such as common sense, is key to fluid human conversations. Current neural response generation (RG) models are trained to generate responses directly, omitting unstated implicit knowledge. In this paper, we present Think-Before-Speaking (TBS), a generative approach to first externalize implicit commonsense knowledge (think) and use this knowledge to generate responses (speak). We expect that externalizing implicit knowledge allows more efficient learning, produces more informative responses, and enables more explainable models. We analyze different choices to collect knowledge-aligned dialogues, represent implicit knowledge, and transition between knowledge and dialogues. Empirical results show TBS models outperform end-to-end and knowledge-augmented RG baselines on most automatic metrics and generate more informative, specific, and commonsense-following responses, as evaluated by human annotators. TBS also generates knowledge that makes sense and is relevant to the dialogue around 85% of the time.
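
The think-then-speak idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the marker strings `K_START` and `K_END`, the helper names, and the `generate` callable (a stand-in for any text-generation model) are all assumptions introduced here for illustration, and the dialogue used in the usage note is made up.

```python
from typing import Callable, List, Tuple

# Hypothetical marker strings separating knowledge from dialogue;
# the symbols used in the paper may differ.
K_START = "<implicit>"
K_END = "</implicit>"


def build_training_text(
    history: List[str], knowledge: List[str], response: str
) -> Tuple[str, str]:
    """Return (source, target) for one knowledge-aligned dialogue turn.

    The source is the dialogue history; the target asks the model to
    emit the knowledge first and the response second.
    """
    source = " ".join(history)
    target = f"{K_START} {' '.join(knowledge)} {K_END} {response}"
    return source, target


def think_then_speak(
    history: List[str], generate: Callable[[str], str]
) -> Tuple[str, str]:
    """Two-step inference: generate knowledge, then a response that sees it."""
    context = " ".join(history)
    # Think: continue the history with the knowledge marker.
    knowledge = generate(f"{context} {K_START}")
    # Speak: condition on both the history and the generated knowledge.
    response = generate(f"{context} {K_START} {knowledge} {K_END}")
    return knowledge, response
```

For example, calling `build_training_text(["I need to buy some flowers for my wife."], ["rose is a type of flower"], "Perhaps you'd like roses?")` yields the history as the source and a target in which the knowledge statement precedes the response. Passing `generate` in as a parameter keeps the sketch independent of any particular model library.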

Paper Structure

This paper contains 46 sections, 3 equations, 9 figures, 9 tables.

Figures (9)

  • Figure 1: A motivating example for our study. We look to train models to externalize the implicit knowledge grounding step by explicitly generating knowledge before responding.
  • Figure 2: Method illustration. We first propose matching approaches to construct knowledge-aligned dialogues. Then we consider different alternatives to represent implicit knowledge. Finally, we connect knowledge and dialogue and ask models to generate both knowledge and responses given history.
  • Figure 3: Human evaluation results for pairwise comparison between TBS and a baseline. We show preference percentages for each model. "*" indicates a statistically significant difference. For TBS we show averaged preferences.
  • Figure 4: Human evaluation comparing TBS with models that have access to ground-truth responses.
  • Figure 5: Effects of noisy knowledge on response quality.
  • ...and 4 more figures