CorrectionLM: Self-Corrections with SLM for Dialogue State Tracking

Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf

TL;DR

This work introduces CORRECTIONLM, a novel correction framework that enables SLMs to self-correct using in-context exemplars without LLM involvement, and achieves results similar to a state-of-the-art LLM at a small fraction of the computation costs.

Abstract

Large language models (LLMs) have demonstrated self-improvement capabilities via feedback and refinement, but current small language models (SLMs) have had limited success in this area. Existing correction approaches often rely on distilling knowledge from LLMs, which imposes significant computation demands. In this work, we introduce CORRECTIONLM, a novel correction framework that enables SLMs to self-correct using in-context exemplars without LLM involvement. Applied to two dialogue state tracking (DST) tasks in low-resource settings, CORRECTIONLM achieves results similar to a state-of-the-art LLM at a small fraction of the computation costs.

Paper Structure

This paper contains 26 sections, 7 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Illustration of the two-pass approach of CorrectionLM. The first pass uses the baseline ICL model, which prompts the SLM to predict the turn-level belief (TLB) from retrieved in-context examples of inputs and gold outputs. The second pass prompts a second LM with examples of corrections.
  • Figure 2: Illustration of the CorrectionLM training process. The first step is to prompt the first-pass inference SLM with a few in-context exemplars to produce predictions for each example in the training set. The second step is to finetune an SLM to generate the gold label given correction-augmented in-context exemplars and the target input and initial target prediction from the first step.
  • Figure 3: Cross-domain generalization results on SGD. A test dialogue is labeled In-Domain when all of its domains appear in the training set and OOD when none of them do; all remaining dialogues are labeled Half OOD. We report DST joint goal accuracy (JGA) for all settings.
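
The two-pass flow described in the captions of Figures 1 and 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt templates, the `Exemplar` fields, the function names, and the `first_lm` / `correction_lm` callables are all hypothetical, and exemplar retrieval is assumed to have happened already.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Exemplar:
    """One turn with its gold label (hypothetical structure)."""
    dialogue: str          # dialogue context for the turn
    gold: str              # gold output for the turn
    first_pass: str = ""   # first-pass SLM prediction, used in correction prompts


def first_pass_prompt(exemplars: List[Exemplar], target_dialogue: str) -> str:
    # First pass: in-context examples pair each input with its gold output.
    parts = [f"Input: {e.dialogue}\nOutput: {e.gold}" for e in exemplars]
    parts.append(f"Input: {target_dialogue}\nOutput:")
    return "\n\n".join(parts)


def correction_prompt(exemplars: List[Exemplar], target_dialogue: str,
                      target_first_pass: str) -> str:
    # Second pass: each exemplar shows an initial prediction and its correction.
    parts = [
        f"Input: {e.dialogue}\nInitial prediction: {e.first_pass}\n"
        f"Corrected output: {e.gold}"
        for e in exemplars
    ]
    parts.append(
        f"Input: {target_dialogue}\nInitial prediction: {target_first_pass}\n"
        "Corrected output:"
    )
    return "\n\n".join(parts)


def two_pass_predict(first_lm: Callable[[str], str],
                     correction_lm: Callable[[str], str],
                     exemplars: List[Exemplar],
                     target_dialogue: str) -> str:
    # Pass 1 produces an initial prediction; pass 2 revises it.
    initial = first_lm(first_pass_prompt(exemplars, target_dialogue))
    return correction_lm(correction_prompt(exemplars, target_dialogue, initial))


def make_training_pair(exemplars: List[Exemplar], target: Exemplar) -> Dict[str, str]:
    # Fine-tuning example: correction-augmented prompt -> gold label.
    prompt = correction_prompt(exemplars, target.dialogue, target.first_pass)
    return {"prompt": prompt, "completion": " " + target.gold}
```

In this sketch, `make_training_pair` would be applied to every training example after the first-pass SLM has filled in its `first_pass` field, yielding the data on which the correction model is fine-tuned.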