Topic-to-essay generation with knowledge-based content selection

Jieyong Wang, Chunyao Song, Yihao Wu

TL;DR

This work tackles topic-to-essay generation (TEG), in which fluent, novel paragraphs must stay aligned with a small set of topic words. It proposes GCS-IPT, a GENIUS-based encoder-decoder with a copy mechanism and a content selection module, trained with an improved prefix-tuning strategy to preserve knowledge while adapting to varying topic counts. A new Chinese TEG dataset, NAES, is released, and experiments on ZHIHU, ESSAY, and NAES show substantial gains in text diversity (DIST-2) and novelty while maintaining strong topic consistency, with BLEU remaining competitive. The approach combines content-aware copying with topic-aware prefixes to produce coherent, varied output without sacrificing topical alignment.

Abstract

Topic-to-essay generation (TEG) is a challenging natural language generation task that aims to generate paragraph-level text with high semantic coherence from a given set of topic words. Previous work has focused on introducing external knowledge while overlooking the limited diversity of the generated text. To improve generation diversity, we propose a novel copy mechanism model with a content selection module that integrates rich semantic knowledge from the language model into the decoder. Furthermore, we introduce an improved prefix-tuning method to train the model, enabling it to adapt to varying input complexities. In addition, we contribute a new Chinese dataset for TEG tasks. Experimental results demonstrate that the proposed model improves generated text diversity by 35% to 59% compared to the state-of-the-art method, while maintaining a high level of topic consistency.
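
For readers unfamiliar with how diversity is typically scored, DIST-2 is commonly computed as the number of distinct bigrams divided by the total number of bigrams across the generated outputs. The following is a minimal sketch under that standard definition. The function name `distinct_n` and the assumption that each output is already tokenized are illustrative choices, not the authors' evaluation script, and the paper's exact tokenization for Chinese text is not stated here.

```python
def distinct_n(texts, n=2):
    """Ratio of unique n-grams to total n-grams over a list of token lists.

    `texts` is assumed to be pre-tokenized (hypothetical input format);
    for Chinese, tokens could be characters or segmented words.
    """
    total = 0
    unique = set()
    for tokens in texts:
        # Slide a window of size n over the token sequence.
        ngrams = list(zip(*(tokens[i:] for i in range(n))))
        total += len(ngrams)
        unique.update(ngrams)
    # Return 0.0 when no n-grams exist, to avoid dividing by zero.
    return len(unique) / total if total else 0.0


if __name__ == "__main__":
    repetitive = [["a", "b", "a", "b"]]  # bigrams: (a,b), (b,a), (a,b)
    varied = [["a", "b", "c", "d"]]      # bigrams: (a,b), (b,c), (c,d)
    print(distinct_n(repetitive), distinct_n(varied))
```

A higher score means fewer repeated bigrams, so the repetitive sequence scores lower than the varied one.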

Paper Structure (12 sections, 8 equations, 2 figures, 2 tables)

Figures (2)

  • Figure 1: An example of essay generation with given topics.
  • Figure 2: Overview of our proposed model for the topic-to-essay generation task.