Dialogue Natural Language Inference
Sean Welleck, Jason Weston, Arthur Szlam, Kyunghyun Cho
TL;DR
This work reframes dialogue consistency as a natural language inference (NLI) problem and introduces the Dialogue NLI dataset, which lets NLI models classify the relation between dialogue utterances and persona content as entailment, neutral, or contradiction. A re-ranking approach uses an NLI model trained on Dialogue NLI to penalize contradictory candidates, improving persona consistency in downstream dialogue generation. The authors validate the method with automatic metrics on several evaluation sets and with human evaluation, reporting fewer contradictions and better alignment with persona content. The work contributes both a new dataset and a practical method for applying NLI to dialogue systems, and it suggests avenues for future research on integrating NLI into downstream tasks.
Abstract
Consistency is a long-standing issue faced by dialogue models. In this paper, we frame the consistency of dialogue agents as natural language inference (NLI) and create a new natural language inference dataset called Dialogue NLI. We propose a method which demonstrates that a model trained on Dialogue NLI can be used to improve the consistency of a dialogue model, and evaluate the method with human evaluation and with automatic metrics on a suite of evaluation sets designed to measure a dialogue model's consistency.
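The re-ranking idea summarized above, in which candidates that contradict the persona are penalized, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`rerank`, `toy_contradiction_prob`), the subtractive penalty, the use of the worst-case contradiction over persona sentences, and the `penalty_weight` parameter are all assumptions made for illustration, and the toy scorer is a hand-written stand-in for an NLI model trained on Dialogue NLI.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    candidates: Sequence[str],
    base_scores: Sequence[float],
    persona: Sequence[str],
    contradiction_prob: Callable[[str, str], float],
    penalty_weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Re-rank candidates by subtracting a contradiction penalty.

    Assumption: the penalty is the highest contradiction probability
    between the candidate and any persona sentence, scaled by
    penalty_weight. The paper's exact scoring rule may differ.
    """
    if len(candidates) != len(base_scores):
        raise ValueError("candidates and base_scores must have equal length")

    rescored = []
    for candidate, score in zip(candidates, base_scores):
        # Worst-case contradiction against the persona (0.0 if persona is empty).
        worst = max(
            (contradiction_prob(p, candidate) for p in persona),
            default=0.0,
        )
        rescored.append((candidate, score - penalty_weight * worst))

    # Highest adjusted score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def toy_contradiction_prob(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for a trained NLI model.

    Returns 1.0 only for one hard-coded contradictory pair.
    """
    known_contradictions = {
        ("i have two cats.", "i do not have any pets."),
    }
    pair = (premise.lower(), hypothesis.lower())
    return 1.0 if pair in known_contradictions else 0.0


persona = ["I have two cats."]
candidates = ["I do not have any pets.", "My cats keep me busy."]
base_scores = [0.9, 0.7]  # made-up scores from a hypothetical dialogue model

ranked = rerank(candidates, base_scores, persona, toy_contradiction_prob)
for text, score in ranked:
    print(f"{score:+.2f}  {text}")
```

In this toy setup the candidate that the dialogue model originally preferred drops below the consistent one once the penalty is applied. In practice, `toy_contradiction_prob` would be replaced by the contradiction probability from a trained NLI classifier, and `penalty_weight` would need tuning.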
