SC2: Towards Enhancing Content Preservation and Style Consistency in Long Text Style Transfer

Jie Zhao, Ziyu Guan, Cai Xu, Wei Zhao, Yue Jiang

TL;DR

This paper tackles long text style transfer, targeting two problems: content preservation and style consistency across sentences. It introduces SC2, a framework that combines three components: a multilayer Joint Style-Content Weighed (JSCW) module that disentangles style and content at the token level, a Style Fusion Module that injects the target style into the content representations, and a denoising non-autoregressive (NAR) decoder that accelerates training. The model is trained with a multi-objective loss covering style guidance, content reconstruction, NAR augmentation, and a disentanglement penalty. It yields substantial improvements over strong baselines on a stylized long-text dataset in Chinese and English, with gains in content preservation and style control while maintaining fluency and consistency across multiple sentences. As an extrinsic evaluation, SC2 also serves as an effective data augmenter for legal-domain charge prediction, which underscores its practical value for data-scarce NLP tasks.

Abstract

Text style transfer (TST) aims to vary the style polarity of text while preserving the semantic content. Although recent advancements have demonstrated remarkable progress in short TST, short TST remains a relatively straightforward task with limited practical applications. The more comprehensive long TST task presents two challenges: (1) existing methods struggle to accurately evaluate content attributes across multiple words, leading to content degradation; (2) the vanilla style classifier loss struggles to maintain a consistent style across multiple generated sentences. In this paper, we propose a novel method, SC2, in which a multilayer Joint Style-Content Weighed (JSCW) module and a Style Consistency loss are designed to address these two issues. The JSCW simultaneously assesses the amounts of style and content attributes within a token, aiming to acquire a lossless content representation and thereby enhance content preservation. The multiple JSCW layers progressively refine the content representations. We design a style consistency loss to ensure that the multiple generated sentences consistently reflect the target style polarity. Moreover, we incorporate a denoising non-autoregressive decoder to accelerate training. We conduct extensive experiments, and the results show significant improvements of SC2 over competitive baselines. Our code is available at https://github.com/jiezhao6/SC2.

Paper Structure

This paper contains 28 sections, 13 equations, 4 figures, and 5 tables.

Figures (4)

  • Figure 1: Comparison of content learning between existing approaches and the proposed method: (a) evaluating the relevance between text $x$ and its style; (b) evaluating the relevance between source text $x_s$ and target text $x_t$; and (c) jointly evaluating the relevance between text $x$ and both its style and its content.
  • Figure 2: The framework of SC2. We show an example of transferring $x$ (long source text) to $y$ (long generated text). Dashed lines with arrows indicate that these processes occur only during the training phase.
  • Figure 3: Generated texts with the style of JY. The number preceding each sentence in the generated texts identifies the semantically corresponding sentence in the source text. Underlined sentences or phrases denote inserted content tailored to match the target style. Matching text colors in the source and generated texts highlight the rewritten content.
  • Figure 4: t-SNE visualization of the Chinese and English test datasets in the original space (blue dots), content space (red dots), and fused space (green dots).