Control Prefixes for Parameter-Efficient Text Generation
Jordan Clive, Kris Cao, Marek Rei
TL;DR
Control Prefixes make text generation parameter-efficient by keeping a large pretrained LM fixed and learning input-conditioned control prefixes, one per attribute, so that generation can be guided at the level of individual datapoints without full fine-tuning. The method optimizes a compact set of prefixes, including re-parameterized components shared across attention classes, and steers generation via input guidance $G$. It achieves state-of-the-art or competitive results on data-to-text benchmarks (e.g., WebNLG, DART), improves SARI and FKGL scores on text simplification, and attains strong ROUGE scores with superior human evaluations on XSum, all while adding less than 3% additional parameters. The approach also demonstrates zero-shot transfer when attribute labels are semantically similar, and the learned prefixes show an interpretable organization across layers and attention types, which makes the method practical for deploying scalable, controllable NLG systems.
Abstract
Prefix-tuning is a powerful lightweight technique for adapting a large pre-trained language model to a downstream application. However, it uses the same dataset-level tuned prompt for all examples in the dataset. We extend this idea and propose a dynamic method, Control Prefixes, which allows for the inclusion of conditional input-dependent information, combining the benefits of prompt tuning and controlled generation. The method incorporates attribute-level learnable representations into different layers of a pre-trained transformer, allowing for the generated text to be guided in a particular direction. We provide a systematic evaluation of the technique and apply it to five datasets from the GEM benchmark for natural language generation (NLG). Although the aim is to develop a parameter-efficient model, we show Control Prefixes can even outperform full fine-tuning methods. We present state-of-the-art results on several data-to-text datasets, including WebNLG.
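The contrast between a single dataset-level prompt and attribute-level prefixes can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ControlPrefixes`, the helper `make_prefix`, the attribute labels, and all sizes are hypothetical, and each prefix position is simplified to one vector rather than a separate key and value per attention layer. The point is only that every example shares a general prefix, while the control prefix is selected by the example's attribute label.

```python
import random


def make_prefix(length, dim, rng):
    """Create a prefix: `length` small random vectors of size `dim`.

    In a real system these would be trainable parameters; here they are
    plain lists so the sketch runs without any dependencies.
    """
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(length)]


class ControlPrefixes:
    """Hypothetical container: one general prefix plus one control prefix
    per attribute label, with a separate copy for every layer."""

    def __init__(self, labels, num_layers, general_len, control_len, dim, seed=0):
        rng = random.Random(seed)
        # Shared by every example in the dataset (as in plain prefix-tuning).
        self.general = [make_prefix(general_len, dim, rng) for _ in range(num_layers)]
        # Selected per example according to its attribute label.
        self.control = {
            label: [make_prefix(control_len, dim, rng) for _ in range(num_layers)]
            for label in labels
        }

    def prefix_for(self, label, layer):
        """Return the general prefix followed by the control prefix."""
        return self.general[layer] + self.control[label][layer]


def prepend_prefix(prefix, hidden_states):
    """Prepend prefix vectors to a layer's sequence of vectors.

    The pretrained model's own weights are never modified.
    """
    return prefix + hidden_states


# Usage with made-up attribute labels and sizes.
prefixes = ControlPrefixes(
    labels=["Airport", "Food"], num_layers=2, general_len=3, control_len=2, dim=4
)
tokens = [[0.0] * 4 for _ in range(5)]  # stand-in for 5 token vectors
extended = prepend_prefix(prefixes.prefix_for("Food", layer=0), tokens)
print(len(extended))  # 3 general + 2 control + 5 token positions
```

Two examples with different labels see the same general prefix but different control prefixes, which is what allows the guidance to vary per datapoint while the base model stays frozen.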
