Chain of Thought Explanation for Dialogue State Tracking

Lin Xu; Ningxin Peng; Daquan Zhou; See-Kiong Ng; Jinlan Fu

TL;DR

This paper tackles dialogue state tracking (DST) by introducing Chain-of-Thought-Explanation (CoTE), which couples slot-value predictions with stepwise reasoning explanations. CoTE-Coarse chains the relevant dialogue utterances into a coarse reasoning trace, while CoTE-Refined uses GPT-3 paraphrasing to turn that trace into a fluent explanation; in both cases a prompted PLM generates the explanation alongside the slot value $v_i$. Across MultiWOZ 2.2, M2M, and WoZ 2.0, CoTE variants outperform strong baselines, with the Refined version delivering the largest gains, especially on samples requiring multi-step reasoning and in low-resource settings. The results indicate that explicit reasoning traces can improve generalization and accuracy in DST, suggesting practical benefits for robust conversational agents and informing future research on interpretable reasoning in dialogue systems.

Abstract

Dialogue state tracking (DST) aims to record user queries and goals during a conversational interaction by maintaining a predefined set of slots and their corresponding values. Current approaches decide slot values opaquely, while humans usually adopt a more deliberate approach: they collect information from relevant dialogue turns and then reason out the appropriate values. In this work, we focus on the steps needed to figure out slot values by proposing a model named Chain-of-Thought-Explanation (CoTE) for the DST task. CoTE, which is built on the generative DST framework, is designed to create detailed explanations step by step after determining the slot values. This process leads to more accurate and reliable slot values. Moreover, to improve the reasoning ability of CoTE, we further construct more fluent and high-quality explanations with automatic paraphrasing, leading to the method CoTE-refined. Experimental results on three widely recognized DST benchmarks (MultiWOZ 2.2, WoZ 2.0, and M2M) demonstrate the remarkable effectiveness of CoTE. Furthermore, through a meticulous fine-grained analysis, we observe significant benefits of CoTE on samples characterized by longer dialogue turns, user responses, and reasoning steps.

Paper Structure (35 sections, 1 equation, 3 figures, 6 tables)

Introduction
Related Work
Dialogue State Tracking
Pretrained Language Models
Chain-of-Thought Reasoning
Method
Task Definition
Chain-of-Thought Explanation
Coarse Explanation
Refined Explanation
Prompt Template
Experiment Settings
Datasets
Baselines
Evaluation Metric
...and 20 more sections

Figures (3)

Figure 1: (a) A multi-step reasoning example for DST. The slot value of 'hotel-stars' at turn 3 (T3) depends on the content across T1, T2, and T3. (b) Sample ratios by number of reasoning steps on the MultiWOZ 2.2 (MWZ), M2M, and WoZ 2.0 (WoZ) datasets. Nearly 40% of samples require multi-step reasoning ($\text{step} \geq 2$).
Figure 2: Fine-grained analysis of four DST models on the MultiWOZ 2.2 dataset. CoTE-refined substantially outperforms DS2 and SDP on multi-step samples (i.e., step2 and step3).
Figure 3: The framework of CoTE. A DST sample is inserted into prompt templates to form the input to the PLM. The output includes both the slot value $v_i$ and the explanation $E_i$, which can be a Coarse Explanation or a Refined Explanation. The former is the concatenation of the dialogue snippets extracted to serve as the explanation, while the latter is the coarse explanation paraphrased by GPT-3.
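
The input/output format described for Figure 3 can be illustrated with a minimal Python sketch of how a coarse explanation might be assembled into a training target. This is not the authors' implementation: the `DSTSample` fields, the template wording in `build_input` and `build_target`, and the example dialogue are all hypothetical, and the indices of the relevant turns are assumed to be given.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DSTSample:
    turns: List[str]           # dialogue history, one utterance per entry
    slot: str                  # slot name, e.g. "hotel-stars"
    value: str                 # gold slot value v_i
    relevant_turns: List[int]  # indices of supporting turns (assumed to be known)


def build_input(sample: DSTSample) -> str:
    """Insert the dialogue and slot into a (hypothetical) prompt template."""
    history = " ".join(sample.turns)
    return f"Dialogue: {history} Question: what is the value of {sample.slot}?"


def coarse_explanation(sample: DSTSample) -> str:
    """Concatenate the relevant dialogue snippets, in order, as the explanation E_i."""
    return " ".join(sample.turns[i] for i in sample.relevant_turns)


def build_target(sample: DSTSample) -> str:
    """Slot value first, explanation second, mirroring 'explain after deciding'."""
    return f"{sample.value}. Explanation: {coarse_explanation(sample)}"


# Invented example dialogue, for illustration only.
sample = DSTSample(
    turns=[
        "[user] I need a hotel in the north.",
        "[system] There is a 4-star and a 2-star option.",
        "[user] The first one sounds good.",
    ],
    slot="hotel-stars",
    value="4",
    relevant_turns=[1, 2],
)

print(build_input(sample))
print(build_target(sample))
```

In this sketch the refined variant would replace the output of `coarse_explanation` with a paraphrase produced by an external model; that step is omitted here because it depends on a model call not specified in this summary.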
