Multi-Aspect Controllable Text Generation with Disentangled Counterfactual Augmentation

Yi Liu, Xiangyu Liu, Xiangrong Zhu, Wei Hu

TL;DR

This work tackles multi-aspect controllable text generation (MCTG) under imbalanced attribute correlations that induce stereotyped associations. It introduces MAGIC, a prefix-tuning-based framework that learns an attribute latent space augmented with disentangled counterfactual features, enabling balanced training and enhanced inference via target-guided counterfactual augmentation. Through resampling and a set of disentanglement losses, MAGIC mitigates bias across attributes and improves control over sentiment, topic, and detoxification, achieving state-of-the-art results in both imbalanced and balanced setups. The approach improves attribute relevance and text quality, and its code and data are publicly available.

Abstract

Multi-aspect controllable text generation aims to control the generated texts in attributes from multiple aspects (e.g., "positive" from sentiment and "sport" from topic). For ease of obtaining training samples, existing works neglect attribute correlations formed by the intertwining of different attributes. Particularly, the stereotype formed by imbalanced attribute correlations significantly affects multi-aspect control. In this paper, we propose MAGIC, a new multi-aspect controllable text generation method with disentangled counterfactual augmentation. We alleviate the issue of imbalanced attribute correlations during training using counterfactual feature vectors in the attribute latent space by disentanglement. During inference, we enhance attribute correlations by target-guided counterfactual augmentation to further improve multi-aspect control. Experiments show that MAGIC outperforms state-of-the-art baselines in both imbalanced and balanced attribute correlation scenarios. Our source code and data are available at https://github.com/nju-websoft/MAGIC.
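
To make "imbalanced attribute correlations" concrete, the following minimal sketch counts how often each (sentiment, topic) pair co-occurs in a toy dataset and then oversamples the rarer pairs. It is an illustration only: the toy data, the field names `sentiment` and `topic`, and the helpers `combo_counts` and `resample_balanced` are assumptions made here, and plain data-level oversampling is a stand-in, not MAGIC's latent-space counterfactual augmentation.

```python
import random
from collections import Counter, defaultdict


def combo_counts(samples):
    """Count how often each (sentiment, topic) pair occurs."""
    return Counter((s["sentiment"], s["topic"]) for s in samples)


def resample_balanced(samples, seed=0):
    """Oversample rarer attribute pairs up to the size of the largest pair.

    Hypothetical helper for illustration; not taken from the MAGIC code base.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for s in samples:
        groups[(s["sentiment"], s["topic"])].append(s)
    target = max(len(g) for g in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Draw the missing examples with replacement from the same group.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Toy data (assumed): "world" skews negative, "sport" skews positive.
toy = (
    [{"sentiment": "negative", "topic": "world"}] * 8
    + [{"sentiment": "positive", "topic": "world"}] * 2
    + [{"sentiment": "positive", "topic": "sport"}] * 7
    + [{"sentiment": "negative", "topic": "sport"}] * 3
)

before = combo_counts(toy)
after = combo_counts(resample_balanced(toy))
```

In this toy setting, `before` shows the skew that would encourage a model to tie "world" to negative sentiment, while `after` gives every attribute pair the same count.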

Paper Structure (35 sections, 12 equations, 5 figures, 7 tables)


Figures (5)

  • Figure 1: The relevance scores of positive and negative sentiment in (a) AGNews and (b) Yelp. (a) The classifiers used for the statistics are from Gu et al. (2022). (b) The statistical data for Yelp are from Yang et al. (2023).
  • Figure 2: Framework of our method. Top part: We use a prefix-tuning-based autoencoder structure as the framework and construct the attribute latent space. Bottom left: The vectors with counterfactual attribute features, generated by the attribute disentanglement module, assist in constructing the attribute latent space. Bottom right: Inference stage with target-guided attribute correlation augmentation to improve multi-aspect control.
  • Figure 3: The attribute disentanglement module. $A_{t}$ and $A_{s}$ denote the explicit and implicit aspects, respectively.
  • Figure 4: Relevance scores of sentiment after changing the control factor of sentiment to the opposite.
  • Figure 5: Effects of attribute correlation imbalance on performance with different attribute combinations.