Beyond Ontology in Dialogue State Tracking for Goal-Oriented Chatbot

Sejin Lee, Dongha Kim, Min Song

TL;DR

This paper tackles dialogue state tracking (DST) for goal-oriented chatbots without predefined ontologies by combining instruction-tuned LLMs with prompt strategies and an anti-hallucination mechanism. A Variational Graph Auto-Encoder (VGAE) stage then predicts the next dialogue state from the inferred states, which supports open-domain use. The approach achieves state-of-the-art ontology-free DST performance (e.g., JGA up to 42.57% on MultiWOZ 2.0 and strong results on the Schema-Guided Dialogue (SGD) dataset) and high-quality next-state predictions with the VGAE (AUC ~98.6%, AP ~99%). These results indicate a significant step toward adaptable, scalable DST that maintains accuracy in open-domain conversations and on real-world data.

Abstract

Goal-oriented chatbots are essential for automating user tasks, such as booking flights or making restaurant reservations. A key component of these systems is Dialogue State Tracking (DST), which interprets user intent and maintains the dialogue state. However, existing DST methods often rely on fixed ontologies and manually compiled slot values, limiting their adaptability to open-domain dialogues. We propose a novel approach that leverages instruction tuning and advanced prompt strategies to enhance DST performance without relying on any predefined ontologies. Our method enables a Large Language Model (LLM) to infer dialogue states through carefully designed prompts and includes an anti-hallucination mechanism to ensure accurate tracking in diverse conversation contexts. Additionally, we employ a Variational Graph Auto-Encoder (VGAE) to model and predict subsequent user intent. Our approach achieves state-of-the-art performance with a Joint Goal Accuracy (JGA) of 42.57%, outperforming existing ontology-less DST models, and performs well in open-domain real-world conversations. This work presents a significant advancement in creating more adaptive and accurate goal-oriented chatbots.
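
The headline metric in the abstract, JGA, counts a turn as correct only when the full predicted dialogue state matches the reference state exactly. The sketch below illustrates that definition on toy data. The slot names, the `normalize` helper, and the lowercase/whitespace normalization are assumptions made for illustration, not the evaluation script used in the paper.

```python
from typing import Dict, List

# A dialogue state maps slot names to values (slot names here are hypothetical).
State = Dict[str, str]


def normalize(state: State) -> State:
    """Lowercase and trim slots and values (an assumed, minimal normalization)."""
    return {slot.strip().lower(): value.strip().lower() for slot, value in state.items()}


def joint_goal_accuracy(predicted: List[State], gold: List[State]) -> float:
    """Fraction of turns whose predicted state equals the gold state exactly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must contain the same number of turns")
    if not gold:
        return 0.0
    correct = sum(normalize(p) == normalize(g) for p, g in zip(predicted, gold))
    return correct / len(gold)


# Toy example: three turns, the third prediction has one wrong value.
gold_states = [
    {"restaurant-area": "centre"},
    {"restaurant-area": "centre", "restaurant-food": "italian"},
    {"restaurant-area": "centre", "restaurant-food": "italian", "restaurant-people": "4"},
]
predicted_states = [
    {"restaurant-area": "Centre"},
    {"restaurant-area": "centre", "restaurant-food": "italian"},
    {"restaurant-area": "centre", "restaurant-food": "italian", "restaurant-people": "2"},
]

jga = joint_goal_accuracy(predicted_states, gold_states)
print(f"JGA = {jga:.4f}")
```

Because a single wrong slot value invalidates the whole turn, JGA is a strict metric, which is worth keeping in mind when reading the 42.57% figure.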

Paper Structure

This paper contains 24 sections, 3 equations, 2 figures, and 7 tables.

Figures (2)

  • Figure 1: An example of tracking user utterances, representing them as a graph, and predicting the next slot-value. Domains are shown as rounded squares and values as circles. The model tracks the user utterance 'Cafe Jello Gallery' and predicts 'Cambridge (circle)' as the next slot-value.
  • Figure 2: The overall model structure. Given the user dialogue input (a), the instruction and prompt strategy extract the dialogue state with appropriate prompts (b). Here, we design an optimal DST prompt with a prompt strategy based on Chain-of-Thought, input (a), and an anti-hallucination step. The extracted dialogue state is then represented as a graph and passed to a VGAE (c) to predict the dialogue state (d) expected from the user's next input.
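
The figure captions describe turning an extracted dialogue state into a graph of domain and value nodes before the VGAE stage. A minimal sketch of that conversion is given below, assuming slots are named `domain-slot` and that each value node is linked to its domain node. The function `build_state_graph`, the node labels, and the edge scheme are hypothetical and are not taken from the paper. The resulting adjacency matrix is the kind of input a link-prediction model such as a VGAE would consume.

```python
from typing import Dict, List, Tuple


def build_state_graph(state: Dict[str, str]) -> Tuple[List[str], List[List[int]]]:
    """Build an undirected domain-value graph from a dialogue state.

    Assumption: slot names look like "domain-slot"; each value is connected
    to its domain. Returns the node labels and a symmetric adjacency matrix.
    """
    nodes: List[str] = []
    index: Dict[str, int] = {}

    def node_id(label: str) -> int:
        # Assign a stable integer id the first time a label is seen.
        if label not in index:
            index[label] = len(nodes)
            nodes.append(label)
        return index[label]

    edges: List[Tuple[int, int]] = []
    for slot, value in state.items():
        domain = slot.split("-", 1)[0]
        edges.append((node_id(f"domain:{domain}"), node_id(f"value:{value}")))

    size = len(nodes)
    adjacency = [[0] * size for _ in range(size)]
    for i, j in edges:
        adjacency[i][j] = 1
        adjacency[j][i] = 1  # undirected graph
    return nodes, adjacency


# Toy state loosely based on the Figure 1 example (slot names are hypothetical).
nodes, adjacency = build_state_graph(
    {"attraction-name": "cafe jello gallery", "attraction-area": "cambridge"}
)
for label, row in zip(nodes, adjacency):
    print(f"{label:32s} {row}")
```

In this framing, predicting the next slot-value amounts to scoring edges that are not yet present in the adjacency matrix, which is what the reported AUC and AP figures would measure.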