Composable Text Controls in Latent Space with ODEs

Guangyi Liu, Zeyu Feng, Yuan Gao, Zichao Yang, Xiaodan Liang, Junwei Bao, Xiaodong He, Shuguang Cui, Zhen Li, Zhiting Hu

TL;DR

LatentOps introduces a latent-space framework for composable text control using energy-based models and an efficient ODE-based sampler. By connecting a pretrained LM to a low-dimensional latent space via parameter-efficient adaptation, it enables arbitrary attribute composition (e.g., sentiment, tense, formality, keywords) without extensive fine-tuning or sequence-space search. The approach shows improved generation/editing quality and substantial runtime efficiency over baselines, demonstrated on Yelp and Amazon data. This latent-space paradigm offers scalable, flexible, and controllable text manipulation suitable for practical applications in generation, editing, and data augmentation.

Abstract

Real-world text applications often involve composing a wide range of text control operations, such as editing the text w.r.t. an attribute, manipulating keywords and structure, and generating new text of desired properties. Prior work typically learns/finetunes a language model (LM) to perform individual or specific subsets of operations. Recent research has studied combining operations in a plug-and-play manner, often with costly search or optimization in the complex sequence space. This paper proposes a new efficient approach for composable text operations in the compact latent space of text. The low-dimensionality and differentiability of the text latent vector allow us to develop an efficient sampler based on ordinary differential equations (ODEs) given arbitrary plug-in operators (e.g., attribute classifiers). By connecting pretrained LMs (e.g., GPT-2) to the latent space through efficient adaptation, we then decode the sampled vectors into desired text sequences. The flexible approach permits diverse control operators (sentiment, tense, formality, keywords, etc.) acquired using any relevant data from different domains. Experiments show that composing those operators within our approach manages to generate or edit high-quality text, substantially improving over previous methods in terms of generation quality and efficiency.

Paper Structure (54 sections, 15 equations, 5 figures, 22 tables)

This paper contains 54 sections, 15 equations, 5 figures, 22 tables.

Figures (5)

  • Figure 1: Examples of different compositions of text operations, such as editing a text in terms of different attributes sequentially (top) or at the same time (middle), or generating a new text with target properties (bottom). The proposed LatentOps enables a single LM (e.g., an adapted GPT-2) to perform arbitrary text operation composition in the latent space.
  • Figure 2: Overview of LatentOps. (Left): We equip pretrained LMs (e.g., GPT-2) with the compact continuous latent space through parameter-efficient adaptation (§\ref{sec:vae_training}). (Right): One could plug in arbitrary operators (e.g., attribute classifiers) to obtain the latent-space EBM (§\ref{sec:latent_ebms}). We then sample desired latent vectors efficiently by solving the ODE which works backwards through the diffusion process from time $t=T$ to $0$. The resulting sample $\bm{z}(0)$ is fed to the decoder (adapted GPT-2) to generate the desired text sequence.
  • Figure 3: The trend of change of accuracy and input-BLEU as $N$ increases. The digit below each data point represents the corresponding $N$.
  • Figure 7: Automatic evaluation results for different $N$ on the Yelp review dataset. We mark the best result in bold and the second best with an underline.
  • Figure 13: Examples of generation with compositional attributes and keywords (expectation and accommodate). We mark the spans that conform to the desired attributes in blue.
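
The sampling step described in the Figure 2 caption (plug in attribute classifiers to form a latent-space energy, then solve an ODE backwards from $t=T$ to $0$) can be illustrated with a small toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 2-dimensional latent space, the two linear "classifiers" in `W`, the linear schedule `beta`, the drift of one half `beta(t)` times the energy gradient, and the plain Euler solver. The decoding of $\bm{z}(0)$ into text by the adapted LM is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical plug-in operators: two linear attribute classifiers
# acting on a toy 2-D latent vector (weights are made up).
W = np.array([[2.0, 0.0],   # toy "sentiment" classifier
              [0.0, 2.0]])  # toy "tense" classifier

def energy(z):
    # Composed energy: sum over attributes of -log p(target attribute | z).
    return -np.sum(np.log(sigmoid(W @ z)))

def grad_energy(z):
    # Gradient of -log sigmoid(w . z) w.r.t. z is -(1 - sigmoid(w . z)) * w.
    p = sigmoid(W @ z)
    return -((1.0 - p)[:, None] * W).sum(axis=0)

def beta(t, beta_min=0.1, beta_max=20.0):
    # Assumed linear noise schedule; the values are illustrative only.
    return beta_min + t * (beta_max - beta_min)

def sample_latent(z_T, T=1.0, n_steps=200):
    # Euler integration of dz/dt = 0.5 * beta(t) * grad E(z) from t=T down to 0.
    # Because dt is negative, each step moves z toward lower energy.
    z = z_T.copy()
    dt = -T / n_steps
    t = T
    for _ in range(n_steps):
        z = z + 0.5 * beta(t) * grad_energy(z) * dt
        t += dt
    return z

rng = np.random.default_rng(0)
z_T = rng.standard_normal(2)   # start from a standard-Gaussian draw at t=T
z_0 = sample_latent(z_T)       # latent vector that would be passed to the decoder
```

Composition in this sketch amounts to summing per-attribute energies, so adding another control only adds one more row to `W`; the sampler itself is unchanged. An adaptive solver could replace the fixed-step Euler loop without altering the rest of the sketch.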