Continual Dialogue State Tracking via Reason-of-Select Distillation

Yujie Feng; Bo Liu; Xiaoyu Dong; Zexin Lu; Li-Ming Zhan; Albert Y. S. Lam; Xiao-Ming Wu

TL;DR

The Reason-of-Select (RoS) distillation method enhances smaller models with a novel "meta-reasoning" capability. Two further improvements, the "multi-value resolution" strategy and the Semantic Contrastive Reasoning Selection method, strengthen RoS by generating DST-specific selection chains and mitigating hallucinations in teachers' reasoning, ensuring effective and reliable knowledge transfer.

Abstract

An ideal dialogue system requires continuous skill acquisition and adaptation to new tasks while retaining prior knowledge. Dialogue State Tracking (DST), a vital component of these systems, often involves learning new services and confronting catastrophic forgetting, along with a critical capability loss termed the "Value Selection Quandary." To address these challenges, we introduce the Reason-of-Select (RoS) distillation method, which enhances smaller models with a novel "meta-reasoning" capability. Meta-reasoning employs an enhanced multi-domain perspective, combining fragments of meta-knowledge from domain-specific dialogues during continual learning, and thereby transcends traditional single-perspective reasoning. The domain bootstrapping process enhances the model's ability to dissect intricate dialogues containing multiple possible values. Its domain-agnostic property aligns data distribution across different domains, effectively mitigating forgetting. Additionally, two novel improvements, the "multi-value resolution" strategy and the Semantic Contrastive Reasoning Selection method, significantly enhance RoS by generating DST-specific selection chains and mitigating hallucinations in teachers' reasoning, ensuring effective and reliable knowledge transfer. Extensive experiments validate the exceptional performance and robust generalization capabilities of our method. The source code is provided for reproducibility.

Paper Structure (33 sections, 3 equations, 7 figures, 9 tables)

Introduction
Motivation for Boosting Meta-reasoning Capability in Continual DST
Pinpointing the Value Selection Quandary
Bridging the Gap of Reasoning Ability
Reason-of-Select Distillation
Problem Formulation
Overview
Teacher's Reasoning Generation
Ensuring Faithful Teaching with Semantic Contrastive Reasoning Selection
Training Student Models via Reasoning-Enhanced Data
Experiments
Experimental Setup
Dataset
Evaluation Protocol
Baselines
...and 18 more sections

Figures (7)

Figure 1: Left: Depiction of the Continual DST learning process. Right: An actual instance of the "Value Selection Quandary" phenomenon, demonstrating a dialogue with three mentioned date values, where the model incorrectly chooses the most recent time at turn 7 rather than the correct value at turn 6.
Figure 2: Performance analysis of LLaMA-7B and T5-small in Continual DST task.
Figure 3: Overview of the Reason-of-Select (RoS) Distillation method. (a) Teacher: A large LM prompted to generate a faithful rationale given a dialogue context and the value for the request slot in the training set via the "multi-value resolution" strategy and Semantic Contrastive Reasoning Selection method. (b) Student: A small LM is fine-tuned to generate an accurate rationale and the corresponding value.
Figure 4: Demonstration of value-level and slot-level perturbations to elicit diverse negative reasonings.
Figure 5: Illustration of the Semantic Contrastive Reasoning Selection method.
...and 2 more figures
