Chain of Thought Explanation for Dialogue State Tracking

Lin Xu, Ningxin Peng, Daquan Zhou, See-Kiong Ng, Jinlan Fu

TL;DR

This paper tackles dialogue state tracking (DST) by introducing Chain-of-Thought-Explanation (CoTE), which couples slot-value predictions with stepwise reasoning explanations. CoTE-Coarse builds a chain of relevant dialogue utterances as a coarse reasoning trace, while CoTE-Refined uses GPT-3 paraphrasing to produce fluent explanations; in both cases a prompted PLM generates the explanation alongside the slot value $v_i$. Across MultiWOZ 2.2, M2M, and WoZ 2.0, CoTE variants outperform strong baselines, with the Refined version delivering the largest gains, especially on samples requiring multi-step reasoning and in low-resource settings. The results indicate that explicit reasoning traces can improve generalization and accuracy in DST, suggesting practical benefits for robust conversational agents and informing future research on interpretable reasoning in dialogue systems.

Abstract

Dialogue state tracking (DST) aims to record user queries and goals during a conversational interaction, which is achieved by maintaining a predefined set of slots and their corresponding values. Current approaches decide slot values opaquely, while humans usually adopt a more deliberate approach by collecting information from relevant dialogue turns and then reasoning out the appropriate values. In this work, we focus on the steps needed to figure out slot values by proposing a model named Chain-of-Thought-Explanation (CoTE) for the DST task. CoTE, which is built on the generative DST framework, is designed to create detailed explanations step by step after determining the slot values. This process leads to more accurate and reliable slot values. Moreover, to improve the reasoning ability of CoTE, we further construct more fluent and high-quality explanations with automatic paraphrasing, yielding the variant CoTE-refined. Experimental results on three widely recognized DST benchmarks (MultiWOZ 2.2, WoZ 2.0, and M2M) demonstrate the remarkable effectiveness of CoTE. Furthermore, through a meticulous fine-grained analysis, we observe significant benefits of CoTE on samples characterized by longer dialogue turns, user responses, and reasoning steps.

Paper Structure

This paper contains 35 sections, 1 equation, 3 figures, 6 tables.

Figures (3)

  • Figure 1: (a) A multi-step reasoning example for DST. The slot value of 'hotel-stars' at turn 3 (T3) depends on the content across T1, T2, and T3. (b) Sample ratios by number of reasoning steps on the MultiWOZ 2.2 (MWZ), M2M, and WoZ 2.0 (WoZ) datasets. Nearly 40% of samples require multi-step reasoning ($\text{step} \geq 2$).
  • Figure 2: Fine-grained analysis of four DST models on the MultiWOZ 2.2 dataset. CoTE-refined outperforms DS2 and SDP by a large margin on multi-step samples (i.e., step2 and step3).
  • Figure 3: The framework of CoTE. A DST sample is inserted into prompt templates to form the inputs to PLMs. The outputs include both the slot value $v_i$ and the explanation $E_i$, which can be a Coarse Explanation or a Refined Explanation. The former is the concatenation of dialogue snippets extracted to serve as the explanation, while the latter is the coarse explanation narrated by GPT-3.
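
The pipeline described in the Figure 3 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The prompt wording, the `explanation:` separator, the helper names (`build_prompt`, `coarse_explanation`, `build_target`), and the example dialogue are all assumptions made for illustration. The sketch only shows how a DST sample might be turned into a PLM input and how a coarse explanation might be formed by concatenating relevant dialogue snippets and appended after the slot value.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str  # "user" or "system"
    text: str


def build_prompt(history: list[Turn], slot: str) -> str:
    """Insert the dialogue history and a slot name into a prompt template.

    The template wording is hypothetical, not the one used in the paper.
    """
    dialogue = " ".join(f"[{t.speaker}] {t.text}" for t in history)
    return f"dialogue: {dialogue} question: what is the value of {slot}?"


def coarse_explanation(history: list[Turn], relevant: list[int]) -> str:
    """Concatenate the utterances marked as relevant to the slot."""
    return " ".join(history[i].text for i in relevant)


def build_target(value: str, explanation: str) -> str:
    """Place the slot value first and the explanation after it.

    The "explanation:" separator is an assumption.
    """
    return f"{value} explanation: {explanation}"


# Invented example dialogue (not taken from any dataset).
history = [
    Turn("user", "I want a 4-star hotel."),
    Turn("system", "Acorn House matches that rating. Shall I book it?"),
    Turn("user", "Yes, please."),
]

prompt = build_prompt(history, "hotel-stars")
explanation = coarse_explanation(history, relevant=[0, 2])
target = build_target("4", explanation)

print(prompt)
print(target)
```

In this sketch the indices of the relevant utterances are supplied by hand. For the refined variant, the caption states that the coarse explanation is narrated by GPT-3; that step is left out here because it depends on an external model.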