Text-to-SQL Task-oriented Dialogue Ontology Construction

Renato Vukovic, Carel van Niekerk, Michael Heck, Benjamin Ruppik, Hsien-Chin Lin, Shutong Feng, Nurul Lubis, Milica Gasic

TL;DR

TeQoDO is a Text-to-SQL framework for constructing task-oriented dialogue (TOD) ontologies from scratch. It leverages LLMs' code-understanding abilities to iteratively build a SQL-backed ontology that encodes domains, slots, values, intents, and actions. By integrating dialogue state tracking (DST) and a notion of dialogue success into the prompt pipeline, TeQoDO produces coherent, update-consistent ontologies that support downstream DST and DST adaptation. The approach outperforms transfer-learning baselines on TOD ontology induction and generalizes to large general ontologies (Wikipedia/arXiv), with ablations demonstrating the essential role of modular TOD concepts. The work advances explainable knowledge extraction from LLMs and offers a scalable pathway to automatic, adaptable ontologies across domains and data scales.

Abstract

Large language models (LLMs) are widely used as general-purpose knowledge sources, but they rely on parametric knowledge, which limits explainability and trustworthiness. In task-oriented dialogue (TOD) systems, knowledge is explicitly separated from the model: an external database structured by an explicit ontology ensures explainability and controllability. However, building such ontologies requires manual labels or supervised training. We introduce TeQoDO, a Text-to-SQL task-oriented Dialogue Ontology construction method. Here, an LLM autonomously builds a TOD ontology from scratch using only its inherent SQL programming capabilities combined with concepts from modular TOD systems provided in the prompt. We show that TeQoDO outperforms transfer learning approaches and that its constructed ontology is competitive on a downstream dialogue state tracking task. Ablation studies demonstrate the key role of modular TOD system concepts. TeQoDO also scales to the construction of much larger ontologies, which we investigate on Wikipedia and arXiv datasets. We view this as a step towards broader application of ontologies.
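
To make the idea of a SQL-backed ontology concrete, here is a minimal sketch in Python using the standard sqlite3 module. It stores the hotel example from Figure 1 (domain "Hotels", slot "Price", value "cheap") as rows in three tables. The table names, column names, and helper functions (build_toy_ontology, add_triple, query_values) are assumptions made for illustration only. They are not the schema or queries that TeQoDO generates, since in the paper the LLM writes the SQL itself; intents and actions are omitted for brevity.

```python
import sqlite3


def build_toy_ontology():
    """Create an in-memory SQLite database with a hypothetical ontology schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE domains (name TEXT PRIMARY KEY);
        CREATE TABLE slots (
            domain TEXT NOT NULL REFERENCES domains(name),
            name   TEXT NOT NULL,
            PRIMARY KEY (domain, name)
        );
        CREATE TABLE slot_values (
            domain TEXT NOT NULL,
            slot   TEXT NOT NULL,
            value  TEXT NOT NULL,
            PRIMARY KEY (domain, slot, value),
            FOREIGN KEY (domain, slot) REFERENCES slots(domain, name)
        );
        """
    )
    return conn


def add_triple(conn, domain, slot, value):
    """Insert a (domain, slot, value) triple; repeated inserts are no-ops."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT OR IGNORE INTO domains VALUES (?)", (domain,))
        conn.execute("INSERT OR IGNORE INTO slots VALUES (?, ?)", (domain, slot))
        conn.execute(
            "INSERT OR IGNORE INTO slot_values VALUES (?, ?, ?)",
            (domain, slot, value),
        )


def query_values(conn, domain, slot):
    """Return the known values of a slot, sorted alphabetically."""
    rows = conn.execute(
        "SELECT value FROM slot_values WHERE domain = ? AND slot = ? ORDER BY value",
        (domain, slot),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_toy_ontology()
    add_triple(conn, "Hotels", "Price", "cheap")
    add_triple(conn, "Hotels", "Price", "cheap")  # duplicate, ignored
    print(query_values(conn, "Hotels", "Price"))  # prints ['cheap']
```

Keeping inserts idempotent (INSERT OR IGNORE against primary keys) is one simple way to let an ontology be updated dialogue by dialogue without accumulating duplicates. Whether TeQoDO relies on this particular mechanism is not stated in the abstract.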

Paper Structure

This paper contains 52 sections, 8 figures, 11 tables, and 1 algorithm.

Figures (8)

  • Figure 1: Example of ontology construction, showing the extraction of the domain "Hotels" with slot "Price" and value "cheap". Actions and intents are defined based on the domains and slots.
  • Figure 2: Example TOD ontology.
  • Figure 3: TeQoDO overview with example DB queries and results.
  • Figure 4: Continuous F1 of different prompts on the MultiWOZ test set, based on LLM explanation.
  • Figure 5: Prompts for TeQoDO steps. db_result_input is the result of the DB queries from the prior step.
  • ...and 3 more figures