UltraGen: Extremely Fine-grained Controllable Generation via Attribute Reconstruction and Global Preference Optimization
Longfei Yun, Letian Peng, Jingbo Shang
TL;DR
UltraGen tackles extremely fine-grained controllable text generation (EFCG) in two stages. A self-supervised Auto-Reconstruction (AR) stage grounds soft and hard attributes, and a Global Preference Optimization (GPO) stage uses Direct Preference Optimization (DPO) to model attribute correlations and promote diverse, coherent constraint combinations. The approach is supported by UltraBench, a large, non-synthesized benchmark with two splits and an average of tens of attributes per sample, which enables robust evaluation of constraint satisfaction rate (CSR) and semantic quality. Empirical results show significant improvements in constraint adherence and text quality: AR stabilizes performance across attribute positions, and AR+GPO delivers strong global generalization and a favorable trade-off between CSR and F1/BERTScore. The work provides a scalable framework for high-constraint text generation with practical applications such as travel planning, and it acknowledges limitations in hard-constraint richness and cross-constraint coherence as directions for future work.
Abstract
Fine granularity is an essential requirement for controllable text generation, which has advanced rapidly with the capabilities of LLMs. However, existing methods focus mainly on a small set of attributes, typically 3 to 5, and their performance degrades significantly when the number of attributes increases by an order of magnitude. To address this challenge, we propose a novel zero-shot approach for extremely fine-grained controllable generation (EFCG) that consists of two phases: auto-reconstruction (AR) and global preference optimization (GPO). In the AR phase, we leverage LLMs to extract soft attributes (e.g., Emphasis on simplicity and minimalism in design) from raw texts and combine them with programmatically derived hard attributes (e.g., The text should be between 300 and 400 words) to construct large multi-attribute requirements (around 45 attributes each), which guide the fine-grained text reconstruction process under weak supervision. In the GPO phase, we apply direct preference optimization (DPO) to refine text generation under diverse attribute combinations, enabling efficient exploration of the global combination space. Additionally, we introduce an efficient attribute sampling strategy to identify and correct potentially erroneous attributes, further improving global optimization. Our framework significantly improves the constraint satisfaction rate (CSR) and text quality for EFCG by mitigating position bias and alleviating attention dilution.
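To make "programmatically derived hard attributes" and CSR concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a hard attribute can be represented as a description paired with a checker function, and that CSR is the fraction of attributes a text satisfies. The names `HardAttribute`, `derive_hard_attributes`, and `constraint_satisfaction_rate`, the 100-word bucket, and the paragraph-count attribute are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class HardAttribute:
    """A checkable constraint: a description plus a predicate over text."""
    description: str
    check: Callable[[str], bool]


def count_paragraphs(text: str) -> int:
    # Paragraphs are assumed to be separated by blank lines.
    return len([p for p in text.split("\n\n") if p.strip()])


def derive_hard_attributes(text: str, bucket: int = 100) -> List[HardAttribute]:
    """Derive hard attributes from a raw text (illustrative, not the paper's exact set)."""
    n_words = len(text.split())
    low = (n_words // bucket) * bucket  # e.g. 350 words -> range 300..400
    high = low + bucket
    n_paragraphs = count_paragraphs(text)
    return [
        HardAttribute(
            f"The text should be between {low} and {high} words",
            # Default arguments freeze the bounds inside the lambda.
            lambda t, lo=low, hi=high: lo <= len(t.split()) <= hi,
        ),
        HardAttribute(
            f"The text should have exactly {n_paragraphs} paragraph(s)",
            lambda t, n=n_paragraphs: count_paragraphs(t) == n,
        ),
    ]


def constraint_satisfaction_rate(text: str, attrs: List[HardAttribute]) -> float:
    """Fraction of attributes satisfied by the text (assumed CSR definition)."""
    if not attrs:
        return 0.0
    return sum(a.check(text) for a in attrs) / len(attrs)
```

Because the attributes are derived from the source text itself, that text satisfies all of them by construction, which is what lets reconstruction run without human labels. Soft attributes such as "Emphasis on simplicity and minimalism in design" cannot be checked with a simple predicate like this and would need a model-based judge.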
