UltraGen: Extremely Fine-grained Controllable Generation via Attribute Reconstruction and Global Preference Optimization

Longfei Yun, Letian Peng, Jingbo Shang

TL;DR

UltraGen tackles the challenge of extremely fine-grained controllable text generation (EFCG) by combining a self-supervised Auto-Reconstruction stage that grounds soft and hard attributes with a Global Preference Optimization stage that models attribute correlations and promotes diverse, coherent constraint combinations via Direct Preference Optimization. The approach is supported by UltraBench, a large, non-synthesized benchmark with two splits and an average of tens of attributes per sample, enabling robust evaluation of CSR and semantic quality. Empirical results show significant improvements in constraint adherence and text quality, with AR stabilizing performance across attribute positions and AR+GPO delivering strong global generalization and favorable trade-offs between CSR and F1/BERTScore. The work provides a scalable framework for high-constraint text generation with practical applications such as travel planning, while acknowledging limitations in hard-constraint richness and cross-constraint coherence for future exploration.

Abstract

Fine granularity is an essential requirement for controllable text generation, which has seen rapid growth with the increasing capabilities of LLMs. However, existing methods focus mainly on a small set of attributes (e.g., 3 to 5), and their performance degrades significantly when the number of attributes increases to the next order of magnitude. To address this challenge, we propose a novel zero-shot approach for extremely fine-grained controllable generation (EFCG) that consists of two phases: auto-reconstruction (AR) and global preference optimization (GPO). In the AR phase, we leverage LLMs to extract soft attributes (e.g., "Emphasis on simplicity and minimalism in design") from raw texts and combine them with programmatically derived hard attributes (e.g., "The text should be between 300 and 400 words") to construct large (around 45 attributes) multi-attribute requirements, which guide the fine-grained text reconstruction process under weak supervision. In the GPO phase, we apply direct preference optimization (DPO) to refine text generation under diverse attribute combinations, enabling efficient exploration of the global combination space. Additionally, we introduce an efficient attribute sampling strategy to identify and correct potentially erroneous attributes, further improving global optimization. Our framework significantly improves the constraint satisfaction rate (CSR) and text quality for EFCG by mitigating position bias and alleviating attention dilution.
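
To make the notions of "programmatically derived hard attributes" and CSR concrete, here is a minimal Python sketch. It assumes that CSR is the fraction of constraints a text satisfies, which the summary above does not spell out, and it covers only hard attributes that can be checked by code; soft attributes would need a separate judge such as an LLM. The helpers `word_count_between`, `contains_keyword`, and `constraint_satisfaction_rate` are hypothetical names introduced for illustration, not part of UltraGen.

```python
import re
from typing import Callable, Dict

Check = Callable[[str], bool]


def word_count_between(lo: int, hi: int) -> Check:
    """Hypothetical hard attribute: whitespace-delimited word count in [lo, hi]."""
    def check(text: str) -> bool:
        return lo <= len(text.split()) <= hi
    return check


def contains_keyword(keyword: str) -> Check:
    """Hypothetical hard attribute: keyword appears as a whole word (case-insensitive)."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", flags=re.IGNORECASE)

    def check(text: str) -> bool:
        return pattern.search(text) is not None
    return check


def constraint_satisfaction_rate(text: str, checks: Dict[str, Check]) -> float:
    """Assumed CSR definition: satisfied constraints divided by total constraints."""
    if not checks:
        raise ValueError("at least one constraint is required")
    satisfied = sum(1 for check in checks.values() if check(text))
    return satisfied / len(checks)


# Toy example: two of the three constraints hold for this six-word text.
text = "Minimalist design keeps only what matters."
checks = {
    "The text should be between 5 and 10 words": word_count_between(5, 10),
    "The text should mention 'design'": contains_keyword("design"),
    "The text should mention 'travel'": contains_keyword("travel"),
}
csr = constraint_satisfaction_rate(text, checks)
```

Keeping each hard attribute as a pair of a natural-language description and a checker function means the same description can be placed in the prompt while the function scores the output. That pairing is a design choice of this sketch, not something the source confirms.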

Paper Structure

This paper contains 51 sections, 4 equations, 9 figures, 13 tables.

Figures (9)

  • Figure 1: Constraint Satisfaction Rate (CSR) across different numbers of attributes for GPT-4o and Llama-3.1-8B-Instruct.
  • Figure 2: The whole pipeline of our two-stage UltraGen framework. The auto-reconstruction stage constructs a large-scale dataset by extracting soft and hard attributes from web corpora and then reconstructing the raw text. The global preference optimization stage applies DPO with attribute correlation modeling and diversity selection to enhance multi-attribute generalization over a global corpus.
  • Figure 3: Score degradation as the position of hard attributes shifts in Llama-3.1-8B-Instruct and Qwen2-7B-Instruct, showing a consistent performance drop.
  • Figure 4: Comparison of average attributes across datasets.
  • Figure 5: The trade-off between F1 score and CSR. While BERTScore tends to improve with more attributes, CSR declines.
  • ...and 4 more figures