Controllable Text Generation in the Instruction-Tuning Era
Dhananjay Ashok, Barnabas Poczos
TL;DR
This paper investigates controllable text generation in the era of instruction-tuned LLMs, revealing that prompting-based baselines often outperform traditional controllable methods on stylistic and structural constraints while approaching human performance on many stylistic tasks. It introduces ConGenBench, a diverse benchmark with 17 task datasets and 18 constraint datasets, and proposes a method to automatically generate constraint datasets using in-context learning, removing dependence on curated constraint resources. The study evaluates 9 baselines and methods, demonstrating that prompting strategies are a strong baseline for instruction-tuned models and highlighting the need for research into more challenging constraints and structural control. Collectively, the work provides both practical benchmarks and methodological tools to expand controllable generation research in the instruction-tuning regime, with implications for safer and more aligned AI systems.
Abstract
While most research on controllable text generation has focused on steering base Language Models, the emerging instruction-tuning and prompting paradigm offers an alternate approach to controllability. We compile and release ConGenBench, a testbed of 17 different controllable generation tasks, using a subset of it to benchmark the performance of 9 different baselines and methods on Instruction-tuned Language Models. To our surprise, we find that prompting-based approaches outperform controllable text generation methods on most datasets and tasks, highlighting a need for research on controllable text generation with Instruction-tuned Language Models specifically. Prompt-based approaches match human performance on most stylistic tasks while lagging on structural tasks, foregrounding a need to study more varied constraints and more challenging stylistic tasks. To facilitate such research, we provide an algorithm that uses only a task dataset and a Large Language Model with in-context capabilities to automatically generate a constraint dataset. This method eliminates the field's dependence on pre-curated constraint datasets, hence vastly expanding the range of constraints that can be studied in the future.
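
The abstract gives no details of the constraint-dataset algorithm, so the following is only a minimal sketch of one plausible reading: label each text from the task dataset with a few-shot (in-context) prompt and keep the resulting (text, label) pairs as a constraint dataset. The names build_prompt, generate_constraint_dataset and toy_llm, the prompt wording, and the yes/no label format are all assumptions made for illustration, not the authors' method. The llm argument stands in for any text-in, text-out model call.

```python
from typing import Callable, List, Tuple


def build_prompt(constraint: str, examples: List[Tuple[str, bool]], text: str) -> str:
    """Assemble a few-shot yes/no prompt (hypothetical wording)."""
    lines = [f"Does the text satisfy this constraint: {constraint}? Answer yes or no.", ""]
    for ex_text, ex_label in examples:
        lines.append(f"Text: {ex_text}")
        lines.append(f"Answer: {'yes' if ex_label else 'no'}")
        lines.append("")
    lines.append(f"Text: {text}")
    lines.append("Answer:")
    return "\n".join(lines)


def generate_constraint_dataset(
    task_texts: List[str],
    constraint: str,
    examples: List[Tuple[str, bool]],
    llm: Callable[[str], str],
) -> List[Tuple[str, int]]:
    """Label task texts with the LLM; skip replies that are not yes/no."""
    dataset = []
    for text in task_texts:
        reply = llm(build_prompt(constraint, examples, text)).strip().lower()
        if reply.startswith("yes"):
            dataset.append((text, 1))
        elif reply.startswith("no"):
            dataset.append((text, 0))
        # Any other reply is treated as unparseable and dropped.
    return dataset


def toy_llm(prompt: str) -> str:
    """Stand-in for a real model: says yes if the final text contains 'great'."""
    final_text = prompt.rsplit("Text: ", 1)[1]
    return "Yes" if "great" in final_text.lower() else "No"


demo = generate_constraint_dataset(
    task_texts=["The food was great.", "The service was slow."],
    constraint="expresses positive sentiment",
    examples=[("What a great day.", True), ("This was a dull film.", False)],
    llm=toy_llm,
)
```

In practice toy_llm would be replaced by a call to an instruction-tuned model, and the labeled pairs could then be used to train or evaluate a constraint classifier.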
