Def-DTS: Deductive Reasoning for Open-domain Dialogue Topic Segmentation
Seungmin Lee, Yongsang Yoo, Minhwa Jung, Min Song
TL;DR
Def-DTS is an open-domain dialogue topic segmentation (DTS) framework built on LLM-based multi-step deductive reasoning. Its prompt is structured into three steps: bidirectional context extraction, utterance intent classification against a domain-agnostic intent pool, and deductive topic shift detection, with an XML-based I/O format throughout. The approach achieves state-of-the-art performance on TIAGE and Dialseg711 and strong results on other datasets. Ablation and analysis confirm that each component (context, intent labeling, and deductive reasoning) contributes to improved accuracy and more reliable topic-shift detection, while also highlighting challenges in intent following for certain utterance types. The work demonstrates the practical value of LLM reasoning for DTS and points toward auto-labeling and integration with other downstream tasks as promising directions for future research.
Abstract
Dialogue Topic Segmentation (DTS) aims to divide dialogues into coherent segments. DTS plays a crucial role in various downstream NLP tasks, but suffers from chronic problems: data shortage, labeling ambiguity, and the growing complexity of recently proposed solutions. Meanwhile, despite advances in Large Language Models (LLMs) and reasoning strategies, these have rarely been applied to DTS. This paper introduces Def-DTS: Deductive Reasoning for Open-domain Dialogue Topic Segmentation, which uses LLM-based multi-step deductive reasoning to improve DTS performance and to enable case studies based on intermediate results. Our method employs a structured prompting approach for bidirectional context summarization, utterance intent classification, and deductive topic shift detection. For the intent classification step, we propose a generalizable intent list for domain-agnostic dialogue intent classification. Experiments in various dialogue settings demonstrate that Def-DTS consistently outperforms traditional and state-of-the-art approaches, with each subtask contributing to improved performance, particularly in reducing type II (false negative) errors. We also explore the potential for auto-labeling, emphasizing the importance of LLM reasoning techniques in DTS.
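To make the structured output concrete, the following is a minimal sketch of how a per-utterance XML response might be parsed into topic segments. It is not the paper's implementation: the tag names (`dialogue`, `utterance`, `context`, `intent`, `topic_shift`), the intent labels, the sample dialogue, and all function names are hypothetical and chosen only for illustration. It also assumes that a topic shift is the positive class, so a type II error is a gold boundary the model failed to predict.

```python
import xml.etree.ElementTree as ET

# Hypothetical model output. The tag names and intent labels below are
# illustrative assumptions, not the schema or intent pool from the paper.
SAMPLE_RESPONSE = """
<dialogue>
  <utterance id="0">
    <context>Speaker A opens by asking about weekend plans.</context>
    <intent>introduce_topic</intent>
    <topic_shift>NO</topic_shift>
  </utterance>
  <utterance id="1">
    <context>Speaker B answers the question about the weekend.</context>
    <intent>answer</intent>
    <topic_shift>NO</topic_shift>
  </utterance>
  <utterance id="2">
    <context>Speaker A moves on to a new job offer.</context>
    <intent>introduce_topic</intent>
    <topic_shift>YES</topic_shift>
  </utterance>
  <utterance id="3">
    <context>Speaker B asks a follow-up about the job offer.</context>
    <intent>follow_up_question</intent>
    <topic_shift>NO</topic_shift>
  </utterance>
  <utterance id="4">
    <context>Speaker A changes the subject to dinner.</context>
    <intent>introduce_topic</intent>
    <topic_shift>YES</topic_shift>
  </utterance>
</dialogue>
"""


def parse_topic_shifts(xml_text):
    """Return one 0/1 flag per utterance; 1 means the utterance starts a new topic."""
    root = ET.fromstring(xml_text.strip())
    flags = []
    for utt in root.findall("utterance"):
        label = (utt.findtext("topic_shift") or "").strip().upper()
        flags.append(1 if label == "YES" else 0)
    return flags


def to_segments(flags):
    """Group utterance indices into contiguous segments using the shift flags."""
    segments = []
    for idx, flag in enumerate(flags):
        # The first utterance always opens a segment, whatever its flag says.
        if idx == 0 or flag == 1:
            segments.append([idx])
        else:
            segments[-1].append(idx)
    return segments


def count_missed_boundaries(predicted, gold):
    """Count type II errors: gold topic shifts that were predicted as no shift."""
    return sum(1 for p, g in zip(predicted, gold) if g == 1 and p == 0)


if __name__ == "__main__":
    flags = parse_topic_shifts(SAMPLE_RESPONSE)
    print(flags)
    print(to_segments(flags))
```

On the sample response, the flags are `[0, 0, 1, 0, 1]` and the resulting segments are `[[0, 1], [2, 3], [4]]`. Keeping the context and intent fields alongside each decision is what would make the intermediate results available for case studies, even though this sketch reads only the final shift label.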
