Control-DAG: Constrained Decoding for Non-Autoregressive Directed Acyclic T5 using Weighted Finite State Automata
Jinghong Chen, Weizhe Lin, Jingbiao Mei, Bill Byrne
TL;DR
This work addresses reliable non-autoregressive (NAR) natural language generation with DA-T5 by introducing Control-DAG, a constrained decoding framework that converts the model-produced DAG into a Weighted Finite State Automaton (WFSA) and enforces lexical, vocabulary, and length constraints on it. By combining Hard Lexical Constraints, Vocabulary Constraints, and Length Constraints, with an optional CBS-style constrained beam search, Control-DAG eliminates OOV errors and ensures that specified entities appear in the output, while keeping decoding fast and DAG-compatible. The approach achieves state-of-the-art NAR results on the Schema Guided Dialogue (SGD) and DART (Data-to-Text) datasets, with zero slot errors and zero neologisms on SGD and strong BLEU and BLEURT gains on both datasets, while decoding faster than autoregressive (AR) baselines. These results demonstrate the practical viability of constrained WFSA-based decoding for NAR NLG and highlight the potential of automata-theoretic methods to address longstanding issues in NAR generation.
Abstract
The Directed Acyclic Transformer is a fast non-autoregressive (NAR) model that performs well in Neural Machine Translation. Two issues prevent its application to general Natural Language Generation (NLG) tasks: frequent Out-Of-Vocabulary (OOV) errors and the inability to faithfully generate entity names. We introduce Control-DAG, a constrained decoding algorithm for our Directed Acyclic T5 (DA-T5) model which offers lexical, vocabulary and length control. We show that Control-DAG significantly enhances DA-T5 on the Schema Guided Dialogue and the DART datasets, establishing strong NAR results for Task-Oriented Dialogue and Data-to-Text NLG.
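To give an intuition for what lexical, vocabulary and length control over a DAG can look like, the following is a minimal Python sketch on a toy graph. It is not the authors' implementation and does not build a real WFSA: the function name `constrained_best_path`, the toy tokens, and the edge probabilities are all hypothetical, the lexical constraint is reduced to a single required token, and length is counted in vertices including `<s>` and `</s>`. The dynamic program tracks (vertex, length, constraint-satisfied) states, which mimics the effect of intersecting the DAG with small constraint automata.

```python
import math

def constrained_best_path(tokens, edges, vocab, required, min_len, max_len):
    """Best-scoring path from vertex 0 to the last vertex of a toy DAG.

    tokens[i] : token emitted at vertex i (vertices are topologically ordered)
    edges     : {(u, v): log_prob} with u < v
    vocab     : allowed tokens (vocabulary constraint)
    required  : token that must appear on the path (simplified lexical constraint)
    min_len, max_len : bounds on the number of vertices visited (length constraint)
    """
    n = len(tokens)
    # Vocabulary constraint: vertices emitting out-of-vocabulary tokens are pruned.
    usable = [tok in vocab for tok in tokens]
    # best[(vertex, length, seen_required)] = (score, backpointer)
    best = {}
    if usable[0]:
        best[(0, 1, tokens[0] == required)] = (0.0, None)
    # Edges only go from lower to higher indices, so one left-to-right sweep suffices.
    for v in range(1, n):
        if not usable[v]:
            continue
        for u in range(v):
            if (u, v) not in edges:
                continue
            for length in range(1, max_len):
                for seen in (False, True):
                    prev = best.get((u, length, seen))
                    if prev is None:
                        continue
                    state = (v, length + 1, seen or tokens[v] == required)
                    score = prev[0] + edges[(u, v)]
                    if state not in best or score > best[state][0]:
                        best[state] = (score, (u, length, seen))
    # Accept only paths that end at the last vertex, satisfy the length
    # bounds, and have emitted the required token.
    finals = [(n - 1, length, True) for length in range(min_len, max_len + 1)
              if (n - 1, length, True) in best]
    if not finals:
        return None
    state = max(finals, key=lambda s: best[s][0])
    score = best[state][0]
    path = []
    while state is not None:
        path.append(tokens[state[0]])
        state = best[state][1]
    return path[::-1], score

# Toy DAG: the unconstrained best path would pass through the neologism "hotell".
tokens = ["<s>", "the", "hotell", "hotel", "room", "is", "booked", "</s>"]
edges = {
    (0, 1): math.log(1.0),
    (1, 2): math.log(0.6), (1, 3): math.log(0.1), (1, 4): math.log(0.3),
    (2, 5): math.log(1.0), (3, 5): math.log(1.0), (4, 5): math.log(1.0),
    (5, 6): math.log(1.0), (6, 7): math.log(1.0),
}
vocab = {"<s>", "the", "hotel", "room", "is", "booked", "</s>"}

result = constrained_best_path(tokens, edges, vocab, "hotel", min_len=1, max_len=8)
print(result[0])  # ['<s>', 'the', 'hotel', 'is', 'booked', '</s>']
```

In this toy setting the vocabulary constraint removes the "hotell" vertex, the lexical constraint rules out the higher-probability "room" path, and a `max_len` that is too small makes the search return `None`, since no accepting path exists.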
