Authorship Style Transfer with Policy Optimization

Shuai Liu, Shantanu Agarwal, Jonathan May

TL;DR

This work tackles authorship style transfer for low-resource target styles by introducing Astrapop, a two-stage framework that first applies supervised fine-tuning on pseudo-parallel data and then applies policy optimization (PO) to directly optimize Toward/Away style objectives. The method uses a neutral paraphraser and a reference model trained on generated data, plus a reward model to estimate style similarity, and then applies PO algorithms (the RL-based PPO and the RL-free DPO and CPO) with a novel reward $R = T + A - (LP^{\alpha}-1)$. It is evaluated on two tasks: low-resource individual authorship transfer with the Million User Dataset (MUD) and community-level native-language transfer with the ETS Corpus. It shows state-of-the-art joint performance with reduced training time, especially for the RL-free PO variants. The work discusses limitations in semantic preservation under aggressive style optimization and ethical considerations around misuse, and suggests future work on online PO, alternative information-injection strategies, and safety mitigations.
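
The arithmetic of the reward $R = T + A - (LP^{\alpha}-1)$ can be illustrated with a minimal sketch. It assumes $T$ and $A$ are the Toward and Away scores named in the summary and that $LP$ is a length-penalty term with exponent $\alpha$; the summary does not define these quantities, so the function name, argument names, and example values below are hypothetical.

```python
def style_reward(toward: float, away: float, length_penalty: float, alpha: float) -> float:
    """Compute R = T + A - (LP**alpha - 1) exactly as written in the summary.

    All four inputs are assumed to be precomputed scalars; how T, A, and LP
    are obtained is not specified in the summary and is not modelled here.
    """
    return toward + away - (length_penalty ** alpha - 1.0)


# Hypothetical scores for one candidate rewrite.
# With LP = 1 the last term is zero, so R reduces to T + A.
r_unit_lp = style_reward(toward=0.7, away=0.6, length_penalty=1.0, alpha=2.0)

# With LP = 1.5 and alpha = 2, the last term is 1.5**2 - 1 = 1.25.
r_large_lp = style_reward(toward=0.7, away=0.6, length_penalty=1.5, alpha=2.0)
```

In this form, a candidate with $LP = 1$ is scored purely by $T + A$, and $\alpha$ controls how strongly any deviation of $LP$ from 1 changes the reward.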

Abstract

Authorship style transfer aims to rewrite a given text into a specified target style while preserving the original meaning of the source. Existing approaches rely on the availability of a large number of target style exemplars for model training. However, these overlook cases where only a limited number of target style examples are available. The development of parameter-efficient transfer learning techniques and policy optimization (PO) approaches suggests that lightweight PO is a feasible approach to low-resource style transfer. In this work, we propose a simple two-stage tune-and-optimize technique for low-resource textual style transfer. We apply our technique to authorship transfer as well as a larger-data native language style task and in both cases find that it outperforms state-of-the-art baseline models.

Paper Structure (47 sections, 18 equations, 1 figure, 18 tables)