Text-to-SQL Task-oriented Dialogue Ontology Construction
Renato Vukovic, Carel van Niekerk, Michael Heck, Benjamin Ruppik, Hsien-Chin Lin, Shutong Feng, Nurul Lubis, Milica Gasic
TL;DR
TeQoDO is a Text-to-SQL framework for constructing task-oriented dialogue (TOD) ontologies from scratch. It uses the code-understanding abilities of LLMs to iteratively build a SQL-backed ontology that encodes domains, slots, values, intents, and actions. By integrating dialogue state tracking (DST) and a notion of dialogue success into the prompt pipeline, TeQoDO produces coherent, update-consistent ontologies that support downstream DST and DST adaptation. The approach outperforms transfer-learning baselines on TOD ontology induction and generalizes to large general ontologies (Wikipedia/arXiv), and ablations show that modular TOD concepts are essential to its performance. The work is a step towards explainable knowledge extraction from LLMs and towards automatic, adaptable ontology construction across domains and data scales.
Abstract
Large language models (LLMs) are widely used as general-purpose knowledge sources, but they rely on parametric knowledge, which limits explainability and trustworthiness. Task-oriented dialogue (TOD) systems instead keep knowledge explicitly separate from the model: an external database, structured by an explicit ontology, ensures explainability and controllability. However, building such ontologies requires manual labels or supervised training. We introduce TeQoDO: a Text-to-SQL task-oriented Dialogue Ontology construction method. Here, an LLM autonomously builds a TOD ontology from scratch using only its inherent SQL programming capabilities combined with concepts from modular TOD systems provided in the prompt. We show that TeQoDO outperforms transfer learning approaches, and that its constructed ontology is competitive on a downstream dialogue state tracking task. Ablation studies demonstrate the key role of modular TOD system concepts. TeQoDO also scales to the construction of much larger ontologies, which we investigate on Wikipedia and arXiv datasets. We view this as a step towards broader application of ontologies.
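To make the idea of a SQL-backed ontology that grows dialogue by dialogue more concrete, the following is a minimal sketch using Python's built-in sqlite3 module. It is not the authors' implementation. The mapping used here (one table per domain, one column per slot, cell contents as values), the example statements, and the helper names apply_update and read_ontology are all assumptions made for illustration; in the actual method the SQL would be generated by an LLM from dialogue text rather than hard-coded.

```python
import sqlite3

# Hypothetical SQL standing in for LLM output after reading two dialogues.
# Assumed mapping (not confirmed by the source):
#   table = domain, column = slot, cell content = value.
DIALOGUE_1_SQL = [
    "CREATE TABLE IF NOT EXISTS hotel (area TEXT, price_range TEXT)",
    "INSERT INTO hotel (area, price_range) VALUES ('north', 'cheap')",
]
DIALOGUE_2_SQL = [
    # A later dialogue mentions a slot that is not yet in the schema.
    "ALTER TABLE hotel ADD COLUMN parking TEXT",
    "INSERT INTO hotel (area, parking) VALUES ('centre', 'yes')",
]


def apply_update(conn, statements):
    """Run one dialogue's SQL statements in order and commit them."""
    for statement in statements:
        conn.execute(statement)
    conn.commit()


def read_ontology(conn):
    """Return {domain: {slot: sorted distinct non-null values}} from the database."""
    ontology = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        ontology[table] = {}
        for column in columns:
            rows = conn.execute(
                f'SELECT DISTINCT "{column}" FROM "{table}" '
                f'WHERE "{column}" IS NOT NULL ORDER BY "{column}"'
            ).fetchall()
            ontology[table][column] = [row[0] for row in rows]
    return ontology


conn = sqlite3.connect(":memory:")
apply_update(conn, DIALOGUE_1_SQL)
apply_update(conn, DIALOGUE_2_SQL)
ontology = read_ontology(conn)
print(ontology)
```

Keeping the ontology in a database means every change is an explicit, inspectable SQL statement, which is one way to read the explainability and controllability argument made in the abstract.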
