Differentiable Scheduled Sampling for Credit Assignment

Kartik Goyal; Chris Dyer; Taylor Berg-Kirkpatrick

TL;DR

This work tackles exposure bias in seq2seq training by introducing a differentiable relaxation of greedy decoding, so that gradients can be backpropagated through earlier decoding decisions. It uses a soft-argmax for the greedy case and a Gumbel-based reparameterization for sample-based training, and plugs these relaxed decoders into scheduled sampling. Empirical results on German-English MT and German NER show consistent improvements over cross-entropy training and conventional scheduled sampling, pointing to improved credit assignment and potentially lower gradient variance. Training efficiency remains comparable to standard seq2seq training, and the approach offers a scalable path to more informative training signals in sequence prediction tasks.

Abstract

We demonstrate that a continuous relaxation of the argmax operation can be used to create a differentiable approximation to greedy decoding for sequence-to-sequence (seq2seq) models. By incorporating this approximation into the scheduled sampling training procedure (Bengio et al., 2015), a well-known technique for correcting exposure bias, we introduce a new training objective that is continuous and differentiable everywhere and that can provide informative gradients near points where previous decoding decisions change their value. In addition, by using a related approximation, we demonstrate a similar approach to sample-based training. Finally, we show that our approach outperforms cross-entropy training and scheduled sampling procedures in two sequence prediction tasks: named entity recognition and machine translation.
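To make the idea of a continuous relaxation of argmax concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the sharpness parameter `alpha`, and the toy scores and embeddings are all hypothetical. Instead of feeding the embedding of the single highest-scoring token to the next decoding step, the sketch feeds a softmax-weighted average of all token embeddings; as `alpha` grows, this average approaches the embedding of the argmax token. Adding Gumbel(0, 1) noise to the scores before the same operation gives a relaxed version of sampling, which is one common way to realize a Gumbel-based reparameterization.

```python
import math
import random


def softmax(scores, alpha=1.0):
    """Numerically stable softmax over alpha * scores."""
    scaled = [alpha * s for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_argmax_embedding(scores, embeddings, alpha=1.0):
    """Softmax-weighted average of embedding rows.

    Approaches the embedding of the argmax token as alpha grows
    (alpha is an assumed sharpness hyperparameter).
    """
    weights = softmax(scores, alpha)
    dim = len(embeddings[0])
    return [sum(w * row[d] for w, row in zip(weights, embeddings))
            for d in range(dim)]


def gumbel_noise(n, rng):
    """Draw n independent Gumbel(0, 1) samples as -log(-log(u))."""
    samples = []
    for _ in range(n):
        u = max(rng.random(), 1e-12)  # guard against log(0)
        samples.append(-math.log(-math.log(u)))
    return samples


def soft_sample_embedding(scores, embeddings, alpha, rng):
    """Relaxed sampling: perturb scores with Gumbel noise, then soft-argmax."""
    noise = gumbel_noise(len(scores), rng)
    noisy = [s + g for s, g in zip(scores, noise)]
    return soft_argmax_embedding(noisy, embeddings, alpha)


# Toy example: three candidate tokens with 2-dimensional embeddings.
scores = [1.0, 3.0, 2.0]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

smooth = soft_argmax_embedding(scores, embeddings, alpha=1.0)
peaked = soft_argmax_embedding(scores, embeddings, alpha=50.0)
sampled = soft_sample_embedding(scores, embeddings, 1.0, random.Random(0))
```

With a small `alpha` the result is a blend of all three embeddings, so changes in the scores move the decoder input smoothly; with a large `alpha` the result is nearly the embedding of the top-scoring token, which mimics greedy decoding while remaining differentiable in the scores.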
