Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer in Prompt Tuning
Kaige Xie, Tong Yu, Haoliang Wang, Junda Wu, Handong Zhao, Ruiyi Zhang, Kanak Mahadik, Ani Nenkova, Mark Riedl
TL;DR
The paper tackles few-shot dialogue summarization by transferring knowledge from dialogue state tracking (DST) through Skeleton-Assisted Prompt Transfer (SAPT). SAPT adds skeleton generation as extra supervision, creating an intermediate, task-specific bridge between DST and summarization, and uses perturbation-based probes to extract skeletons automatically; training on these skeletons also helps preserve model capability during transfer. Empirical results on two TODSum/SPNet-style benchmarks show that SAPT variants consistently outperform a strong prompt-transfer baseline, with SAPT [DST+Summ] achieving the best ROUGE scores and the best human-evaluation ratings. The approach is parameter-efficient, is supported by ablation studies, and highlights the value of skeleton-driven intermediate supervision for cross-task knowledge transfer in dialogue systems.
Abstract
In real-world scenarios, labeled samples for dialogue summarization are usually limited (i.e., few-shot) due to the high cost of annotating high-quality dialogue summaries. To learn efficiently from few-shot samples, previous works have utilized massive annotated data from other downstream tasks and then performed prompt transfer in prompt tuning to enable cross-task knowledge transfer. However, existing general-purpose prompt transfer techniques do not take dialogue-specific information into account. In this paper, we focus on improving prompt transfer from dialogue state tracking to dialogue summarization and propose Skeleton-Assisted Prompt Transfer (SAPT), which leverages skeleton generation as extra supervision. This supervision functions as a medium connecting the distinct source and target tasks, helping the model make better use of dialogue state information. To automatically extract dialogue skeletons as supervised training data for skeleton generation, we design a novel approach with perturbation-based probes that requires neither annotation effort nor domain knowledge. Training the model on such skeletons can also help preserve model capability during prompt transfer. Our method significantly outperforms existing baselines. In-depth analyses demonstrate the effectiveness of our method in facilitating cross-task knowledge transfer in few-shot dialogue summarization.
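The abstract does not spell out how the perturbation-based probes work, so the following is only a minimal sketch of one plausible reading: delete one dialogue turn at a time and keep the turns whose removal changes the output of a source-task (DST) model. The function names `extract_skeleton` and `toy_state_tracker`, the keyword-matching stand-in for a DST model, and the example dialogue are all hypothetical and are not taken from the paper.

```python
from typing import Callable, FrozenSet, List, Sequence


def extract_skeleton(
    turns: Sequence[str],
    predict: Callable[[Sequence[str]], FrozenSet[str]],
) -> List[str]:
    """Keep the turns whose deletion changes the model's prediction.

    Assumption: a turn is "skeleton" material if the probed model's output
    differs once that single turn is removed from the dialogue.
    """
    baseline = predict(turns)
    skeleton = []
    for i, turn in enumerate(turns):
        # Perturb the dialogue by dropping exactly one turn.
        perturbed = list(turns[:i]) + list(turns[i + 1:])
        if predict(perturbed) != baseline:
            skeleton.append(turn)
    return skeleton


def toy_state_tracker(turns: Sequence[str]) -> FrozenSet[str]:
    """Hypothetical stand-in for a DST model: maps keywords to slot values."""
    slots = {
        "cheap": "price=cheap",
        "north": "area=north",
        "italian": "food=italian",
    }
    found = set()
    for turn in turns:
        cleaned = turn.lower().replace(",", " ").replace(".", " ")
        for word in cleaned.split():
            if word in slots:
                found.add(slots[word])
    return frozenset(found)


dialogue = [
    "Hi, I need a place to eat.",
    "Sure, what kind of food?",
    "Italian, and something cheap.",
    "Any area preference?",
    "The north part of town, thanks.",
]

skeleton = extract_skeleton(dialogue, toy_state_tracker)
for turn in skeleton:
    print(turn)
```

In this toy setting only the two turns that carry slot values survive, which illustrates why such a probe needs neither annotation nor domain knowledge: the selection signal comes entirely from the model's own behavior under perturbation.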
