Response Enhanced Semi-supervised Dialogue Query Generation

Jianheng Huang; Ante Wang; Linfeng Gao; Linfeng Song; Jinsong Su

TL;DR

This work tackles the problem of generating effective search queries from dialogue histories to support knowledge-grounded dialogue systems. It introduces SemiDQG, a three-stage semi-supervised framework that uses a response-augmented query producer (RA) to guide a standard query producer (QP); Stage 2 applies similarity-based RA query selection to create high-quality pseudo instances from unlabeled data, and Stage 3 employs RA-guided reinforcement learning to provide fine-grained signals for QP refinement. Empirical results on cross-domain and low-resource benchmarks show that SemiDQG outperforms ChatGPT and competitive baselines, demonstrating strong data efficiency and domain robustness. The work highlights the value of leveraging response information and carefully curated pseudo-labels to improve knowledge-seeking dialogue components, and it provides code to facilitate reproducibility and further research.

Abstract

Leveraging the vast and continually updated knowledge on the Internet is considered an important ability for a dialogue system. The dialogue query generation task was therefore proposed: generate search queries from dialogue histories and submit them to a search engine to retrieve relevant websites. Previous efforts collected conversations with annotated queries and trained a query producer (QP) via standard supervised learning. However, these approaches still face the challenges of data scarcity and domain adaptation. To address these issues, we propose SemiDQG, a semi-supervised learning framework that improves model performance with unlabeled conversations. Based on the observation that the search query is typically related to the topic of the dialogue response, we train a response-augmented query producer (RA) to provide rich and effective training signals for QP. We first apply a similarity-based query selection strategy to select high-quality RA-generated pseudo queries, which are used to construct pseudo instances for training QP and RA. Then, we adopt the REINFORCE algorithm to further enhance QP, with RA-provided rewards as fine-grained training signals. Experimental results and in-depth analysis on three benchmarks show the effectiveness of our framework in cross-domain and low-resource scenarios. In particular, SemiDQG significantly surpasses ChatGPT and competitive baselines. Our code is available at https://github.com/DeepLearnXMU/SemiDQG.
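To make the REINFORCE step mentioned in the abstract more concrete, the following is a minimal toy sketch, not the authors' training code. It assumes a policy that chooses among a small fixed set of candidate queries (the real QP is a sequence generator), and the hand-set `rewards` list is a hypothetical stand-in for RA-provided rewards. The function names `softmax`, `reinforce_step`, and `train` are illustrative only.

```python
import math
import random


def softmax(logits):
    """Convert logits into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, rewards, lr=0.5, rng=random):
    """One REINFORCE update for a softmax policy over candidate queries.

    `rewards[i]` is the (assumed, externally supplied) reward of candidate i.
    """
    probs = softmax(logits)
    # Sample one candidate from the current policy.
    action = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    # Baseline: expected reward under the current policy (variance reduction).
    baseline = sum(p * r for p, r in zip(probs, rewards))
    advantage = rewards[action] - baseline
    # For a softmax policy, d log pi(action) / d logit_i = 1[i == action] - p_i.
    return [
        theta + lr * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (theta, p) in enumerate(zip(logits, probs))
    ]


def train(rewards, steps=200, seed=0):
    """Run several REINFORCE updates from a uniform policy; return final probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        logits = reinforce_step(logits, rewards, rng=rng)
    return softmax(logits)


if __name__ == "__main__":
    # Hypothetical rewards for three candidate queries.
    print(train([0.1, 0.9, 0.2]))
```

In this toy setting the policy shifts probability mass toward the candidate with the highest reward. In the framework described above, the reward would instead come from RA and the policy would be the QP model itself.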

Paper Structure (26 sections, 4 equations, 4 figures, 6 tables)

Introduction
Related Work
Search Query Generation
Semi-supervised Learning
Our Framework
Stage 1: Train QP and RA with Supervised Learning
Stage 2: Semi-supervised Learning with Similarity-based Query Selection
Stage 3: RA-guided Reinforcement Learning
Experiments
Setup
Datasets
Evaluation Metrics
Baselines
Implementation Details
Development Results
...and 11 more sections

Figures (4)

Figure 1: Our proposed Semi-supervised Dialogue Query Generation (SemiDQG) framework. In Stage 1, we train QP and RA via standard supervised training on labeled data (not shown for clarity). In Stage 2, for each unlabeled conversation, we use RA to generate its pseudo queries $\bar{q}$. We only keep the query whose similarity score $s(\bar{q})$ exceeds a given threshold $\alpha$ to construct a pseudo instance. We use these high-quality pseudo instances to train QP and RA. In Stage 3, QP is further enhanced using RA-guided reinforcement learning.
Figure 2: Effect of $\alpha$ on Unigram F1 for development sets of WoW and KdConv in Stage 2.
Figure 3: Results on development sets of WoI and KdConv, with different $N_c$ for probability-based and rank-based rewards.
Figure 4: Unigram F1 test results on WoI in the low-resource scenario.
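
The Stage 2 filtering described in the Figure 1 caption (keep a pseudo query only if its similarity score $s(\bar{q})$ exceeds the threshold $\alpha$) can be sketched as below. This is an illustrative sketch under stated assumptions: the captions do not say what the score compares, so the scorer is passed in as a parameter, and `toy_overlap_score` (a Jaccard token overlap between the conversation text and the query) is a hypothetical placeholder, not the paper's similarity function.

```python
from typing import Callable, Iterable, List, Tuple


def toy_overlap_score(context: str, query: str) -> float:
    """Hypothetical placeholder scorer: Jaccard overlap of lowercase tokens."""
    a = set(context.lower().split())
    b = set(query.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_pseudo_instances(
    candidates: Iterable[Tuple[str, str]],
    score_fn: Callable[[str, str], float],
    alpha: float,
) -> List[Tuple[str, str]]:
    """Keep (conversation, pseudo_query) pairs whose score exceeds alpha."""
    return [
        (conversation, query)
        for conversation, query in candidates
        if score_fn(conversation, query) > alpha
    ]


if __name__ == "__main__":
    # Toy unlabeled conversation paired with two RA-style pseudo queries.
    pairs = [
        ("best hiking trails near seattle", "hiking trails seattle"),
        ("best hiking trails near seattle", "pizza recipes"),
    ]
    print(select_pseudo_instances(pairs, toy_overlap_score, alpha=0.5))
```

A higher $\alpha$ keeps fewer but cleaner pseudo instances; Figure 2 reports how this trade-off affects Unigram F1 on the development sets.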
