Language Model Sentence Completion with a Parser-Driven Rhetorical Control Method
Joshua Zingale, Jugal Kalita
TL;DR
This paper addresses controlling large language model text generation by enforcing predefined rhetorical relations between spans, using a plug-in, parser-driven decoding approach that requires no fine-tuning. It combines BLOOM 1.7B as the generation model with the DMRST RST parser to re-rank candidate next tokens according to how well they satisfy a target relation, using top-p (nucleus) sampling and a balancing parameter to blend model likelihood with parser guidance. Automatic and human evaluations show strong adherence to the desired relations with minimal degradation in perplexity, fluency, or reasonableness, including cross-language results for Spanish. The work enables RST-guided generation and downstream RST tree generation in a practical plug-and-play fashion, offering a principled method to steer discourse structure in LLM outputs with modest computational overhead.
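The re-ranking step described above can be illustrated with a toy sketch. It is not the authors' implementation: the function names, the toy probabilities, the `toy_relation_score` stand-in for the parser, and the convex-combination blend controlled by `alpha` are all assumptions made for illustration, since the summary does not state the exact scoring formula. A real system would obtain next-token probabilities from the language model and relation scores from the RST parser.

```python
from typing import Callable, Dict


def top_p_candidates(probs: Dict[str, float], p: float) -> Dict[str, float]:
    """Keep the smallest set of most-probable tokens whose total mass
    reaches p, then renormalize so the kept probabilities sum to 1."""
    kept: Dict[str, float] = {}
    total = 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = prob
        total += prob
        if total >= p:
            break
    return {token: prob / total for token, prob in kept.items()}


def rerank(
    probs: Dict[str, float],
    relation_score: Callable[[str], float],
    p: float = 0.9,
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Blend model probability with a parser-derived score in [0, 1].

    ASSUMPTION: a convex combination weighted by alpha. alpha = 0 uses only
    the language model; alpha = 1 uses only the parser score."""
    candidates = top_p_candidates(probs, p)
    return {
        token: (1.0 - alpha) * prob + alpha * relation_score(token)
        for token, prob in candidates.items()
    }


def pick_next_token(probs, relation_score, p=0.9, alpha=0.5) -> str:
    """Return the highest-scoring candidate after re-ranking."""
    scores = rerank(probs, relation_score, p, alpha)
    return max(scores, key=scores.get)


# Hypothetical next-token distribution from a language model.
toy_probs = {"and": 0.5, "but": 0.25, "because": 0.2, "zebra": 0.05}

# Hypothetical parser scores for a target "Cause" relation.
_toy_scores = {"and": 0.1, "but": 0.05, "because": 0.9}


def toy_relation_score(token: str) -> float:
    """Stand-in for an RST parser's probability of the target relation."""
    return _toy_scores.get(token, 0.0)


model_only_choice = pick_next_token(toy_probs, toy_relation_score, alpha=0.0)
blended_choice = pick_next_token(toy_probs, toy_relation_score, alpha=0.5)
```

In this toy setup the nucleus with `p = 0.9` drops the low-probability token `zebra`. With `alpha = 0.0` the model's preferred `and` is chosen, while `alpha = 0.5` shifts the choice to `because`, the candidate the stand-in parser scores highest for the target relation.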
Abstract
Controlled text generation (CTG) seeks to guide large language model (LLM) output to produce text that conforms to desired criteria. The current study presents a novel CTG algorithm that enforces adherence to specific rhetorical relations in an LLM sentence-completion context through a parser-driven decoding scheme that requires no model fine-tuning. The method is validated with both automatic and human evaluation. The code is accessible on GitHub.
