Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search
Chris Hokamp, Qun Liu
TL;DR
This paper introduces Grid Beam Search (GBS), a decoding algorithm that forces user-specified lexical constraints to appear in generated sequences without modifying model parameters or training data. GBS organizes its beams on a grid indexed by timestep t and by the number of constraint tokens covered so far, c. It distinguishes open hypotheses, which may generate freely from the model or start a new constraint, from closed hypotheses, which must continue the constraint already in progress. This lets GBS handle multi-token and discontinuous constraints while still scoring every token with the underlying model's probabilities. In interactive machine translation experiments, translation quality improves substantially when constraints are provided, and in domain adaptation experiments, automatically mined terminology yields BLEU gains without any user input. Because it only requires a model that generates a sequence token by token, GBS applies to MT and to other text-generation tasks.
Abstract
We present Grid Beam Search (GBS), an algorithm which extends beam search to allow the inclusion of pre-specified lexical constraints. The algorithm can be used with any model that generates a sequence $ \mathbf{\hat{y}} = \{y_{0}\ldots y_{T}\} $, by maximizing $ p(\mathbf{y} | \mathbf{x}) = \prod\limits_{t}p(y_{t} | \mathbf{x}; \{y_{0} \ldots y_{t-1}\}) $. Lexical constraints take the form of phrases or words that must be present in the output sequence. This is a very general way to incorporate additional knowledge into a model's output without requiring any modification of the model parameters or training data. We demonstrate the feasibility and flexibility of Lexically Constrained Decoding by conducting experiments on Neural Interactive-Predictive Translation, as well as Domain Adaptation for Neural Machine Translation. Experiments show that GBS can provide large improvements in translation quality in interactive scenarios, and that, even without any user input, GBS can be used to achieve significant gains in performance in domain adaptation scenarios.
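To make the idea concrete, the following is a minimal Python sketch of a grid-organized constrained beam search, written for illustration rather than taken from the authors' implementation. It assumes a hypothetical callable `next_log_probs(prefix)` that returns $\log p(y_{t} | \mathbf{x}; \{y_{0} \ldots y_{t-1}\})$ for every token in the vocabulary (the source $\mathbf{x}$ is left implicit), and the names `Hyp`, `grid_beam_search`, and `toy_model` are likewise invented here. Two simplifications are assumptions of this sketch: the end-of-sequence token may only be generated once all constraint tokens are covered, and the toy model is a fixed unigram distribution that ignores its prefix.

```python
import math
from typing import Callable, Dict, List, NamedTuple, Optional, Sequence, Tuple

EOS = "</s>"

class Hyp(NamedTuple):
    score: float                       # cumulative log-probability under the model
    tokens: Tuple[str, ...]            # output tokens so far
    used: Tuple[bool, ...]             # constraints already started or finished
    active: Optional[Tuple[int, int]]  # (constraint, next position) if closed, else None

def grid_beam_search(
    next_log_probs: Callable[[Tuple[str, ...]], Dict[str, float]],
    constraints: Sequence[Sequence[str]],
    max_len: int,
    beam_size: int,
) -> Optional[Hyp]:
    n_cov = sum(len(c) for c in constraints)  # total constraint tokens to cover
    # grid[t][c] holds the beam after t tokens with c constraint tokens covered.
    grid: List[Dict[int, List[Hyp]]] = [{0: [Hyp(0.0, (), (False,) * len(constraints), None)]}]
    finished: List[Hyp] = []
    for t in range(1, max_len + 1):
        row: Dict[int, List[Hyp]] = {}
        # Skip cells from which full coverage can no longer be reached.
        for c in range(max(0, n_cov - (max_len - t)), min(t, n_cov) + 1):
            cands: List[Hyp] = []
            # Generate: open hypotheses with the same coverage pick any model token.
            for h in grid[t - 1].get(c, []):
                if h.active is None:
                    for tok, lp in next_log_probs(h.tokens).items():
                        if tok == EOS and c < n_cov:
                            continue  # sketch assumption: no EOS before full coverage
                        cands.append(Hyp(h.score + lp, h.tokens + (tok,), h.used, None))
            # Start / continue: hypotheses with coverage c - 1 emit a constraint token.
            for h in grid[t - 1].get(c - 1, []):
                lps = next_log_probs(h.tokens)
                if h.active is None:  # open: start any unused constraint
                    for i, cons in enumerate(constraints):
                        if not h.used[i]:
                            used = h.used[:i] + (True,) + h.used[i + 1:]
                            act = (i, 1) if len(cons) > 1 else None
                            cands.append(Hyp(h.score + lps[cons[0]], h.tokens + (cons[0],), used, act))
                else:  # closed: must emit the next token of the unfinished constraint
                    i, pos = h.active
                    cons = constraints[i]
                    act = (i, pos + 1) if pos + 1 < len(cons) else None
                    cands.append(Hyp(h.score + lps[cons[pos]], h.tokens + (cons[pos],), h.used, act))
            cands.sort(key=lambda h: h.score, reverse=True)
            beam = cands[:beam_size]
            row[c] = [h for h in beam if h.tokens[-1] != EOS]
            finished += [h for h in beam if h.tokens[-1] == EOS]
        grid.append(row)
    return max(finished, key=lambda h: h.score) if finished else None

def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    # Hypothetical stand-in for a real model: a fixed unigram distribution.
    return {"a": math.log(0.3), "b": math.log(0.1), "c": math.log(0.1), EOS: math.log(0.5)}

best = grid_beam_search(toy_model, [["b", "c"]], max_len=5, beam_size=2)
free = grid_beam_search(toy_model, [], max_len=5, beam_size=2)
print(best.tokens)  # ('b', 'c', '</s>')
print(free.tokens)  # ('</s>',)
```

Without constraints the toy model prefers to stop immediately, whereas the constrained search returns the most probable sequence that contains the phrase "b c". The model itself is never modified: the constraint only changes which hypotheses are allowed into each grid cell.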
