Advancing Precise Outline-Conditioned Text Generation with Task Duality and Explicit Outline Control

Yunzhe Li, Qian Chen, Weixiang Yan, Wen Wang, Qinglin Zhang, Hari Sundaram

TL;DR

This work defines Precise Outline-conditioned Generation, a task that generates long-form text based on sentence-level outlines to improve controllability, faithfulness, and structure. It introduces two datasets, WPOG and CDM, and establishes evaluation metrics including DV, PD, and CD to quantify outline utilization. The authors propose Explicit Outline Utilization Control (OC) and Unified Dual-task Learning (Dual), plus a combined OC+Dual approach, to alleviate imbalanced outline usage during generation in both fine-tuning and zero-shot settings. Experiments with BART, GPT-2, ChatGPT, and Vicuna show that OC and Dual improve alignment with outlines and overall text quality, with notable gains in structure-related metrics and human judgments, highlighting practical implications for controllable long-form generation.

Abstract

Existing work on outline-conditioned text generation typically treats the provided outlines as rough sketches, such as keywords and phrases. Because such rough outlines lack clarity and rationality, these approaches make it difficult to control the quality of the generated text and to assess the consistency between outlines and generated texts. In this paper, we introduce a novel text generation task called Precise Outline-conditioned Generation, which requires generating stories based on specific, sentence-level outlines. To facilitate research on this task, we construct two new datasets, WPOG and CDM. We provide strong baselines by fine-tuning models such as BART and GPT-2 and by evaluating the zero-shot performance of models such as ChatGPT and Vicuna. Furthermore, we identify an issue of imbalanced utilization of outline information in precise outline-conditioned generation, which is observed ubiquitously across fine-tuned models and zero-shot inference models. To address this issue, we propose an explicit outline utilization control approach and a novel framework that leverages the task duality between summarization and generation. Experimental results show that the proposed approaches effectively alleviate imbalanced outline utilization and enhance the quality of precise outline-conditioned text generation in both fine-tuning and zero-shot settings.

Paper Structure (41 sections, 6 equations, 9 figures, 6 tables)

Figures (9)

  • Figure 1: The comparison between different forms of outlines. Given the prompt (e.g., title), the outline could be formulated as a) rough outlines: a list of keywords or phrases, and b) precise outlines: salient sentence-level statements.
  • Figure 2: The proposed Explicit Outline Utilization Control method versus the baseline approach. We omit the prompt $\mathbf{x}$ for simplicity.
  • Figure 3: The impact of different partition methods on the performance of explicit outline utilization control (OC) on CDM.
  • Figure 4: The human evaluation on overall score and detailed performance of the four methods.
  • Figure 5: Comparison among human evaluation and automatic metrics.
  • ...and 4 more figures