Replacing Language Model for Style Transfer
Pengyu Cheng, Ruineng Li
TL;DR
Text style transfer remains challenging when parallel data is limited. This paper introduces the Replacing Language Model (RLM), which generates target-style text autoregressively by replacing each source token with a span of similar meaning produced by a non-autoregressive masked LM, and enforces token-level style-content disentanglement via mutual-information objectives. On Yelp and Amazon, the approach achieves strong results in both automatic metrics (ACC, Ref-BLEU, Self-BLEU, GM) and human judgments, outperforming several baselines in overall transfer quality. RLM offers a flexible, fine-grained alternative to traditional sentence-level and word-level methods and holds promise for broader sequence-to-sequence tasks.
Abstract
We introduce the replacing language model (RLM), a sequence-to-sequence language modeling framework for text style transfer (TST). Our method autoregressively replaces each token of the source sentence with a text span that has a similar meaning but is written in the target style. Each new span is generated by a non-autoregressive masked language model, which can better preserve the local contextual meaning of the replaced token. This generation scheme combines the flexibility of autoregressive models with the accuracy of non-autoregressive models, bridging the gap between sentence-level and word-level style transfer methods. To control the generation style more precisely, we perform token-level style-content disentanglement on the hidden representations of RLM. Empirical results on real-world text datasets demonstrate the effectiveness of RLM compared with other TST baselines. The code is available at https://github.com/Linear95/RLM.
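The token-by-token replacement scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the lookup table `TOY_SPANS` and the functions `toy_span_generator` and `replace_decode` are hypothetical names introduced here, and the table lookup merely stands in for the masked language model that would generate each span in a real system. The sketch only shows the control flow, in which the output grows left to right while each source token is replaced by a span of variable length.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in for a masked LM: maps (source token, target style)
# to a replacement span. A real model would predict the span from context.
TOY_SPANS: Dict[Tuple[str, str], List[str]] = {
    ("terrible", "positive"): ["really", "great"],
    ("rude", "positive"): ["friendly"],
}


def toy_span_generator(
    prefix: Sequence[str], source: Sequence[str], position: int, style: str
) -> List[str]:
    """Return a replacement span for source[position] in the target style.

    `prefix` (the output generated so far) is accepted to mirror the
    autoregressive interface, but this toy ignores it and only does a
    table lookup, copying the token when no entry exists.
    """
    token = source[position]
    return list(TOY_SPANS.get((token, style), [token]))


def replace_decode(
    source: Sequence[str],
    style: str,
    span_generator: Callable[
        [Sequence[str], Sequence[str], int, str], List[str]
    ] = toy_span_generator,
) -> List[str]:
    """Walk the source left to right, replacing each token with a span."""
    output: List[str] = []
    for position in range(len(source)):
        # Each step sees the already-generated prefix (autoregressive part)
        # and emits a whole span at once (the span-level part).
        output.extend(span_generator(output, source, position, style))
    return output


print(replace_decode("the food was terrible".split(), "positive"))
# ['the', 'food', 'was', 'really', 'great']
```

Because a span may contain zero, one, or several tokens, the output length is not tied to the source length, which is the flexibility the abstract attributes to autoregressive generation; the per-token alignment is what keeps the local meaning close to the source.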
