Disentangling Preference Representation and Text Generation for Efficient Individual Preference Alignment

Jianfei Zhang; Jun Bai; Bei Li; Yanmeng Wang; Rumei Li; Chenghua Lin; Wenge Rong

TL;DR

This work tackles the challenge of efficiently aligning large language models with individual user preferences. It introduces two components. The first, Contrastive Language–Latent Pretraining (CLaP), extends decoder-only LLMs with a probabilistic latent variable $z$ via a latent encoder $q(z|x,y)$ and a latent adapter $p(y|x,z)$, disentangling preference representation from text generation. The second, Latent Direct Preference Optimization (Latent DPO), learns a personalized latent encoder $p_{\theta}(z|x)$ from offline responses and latent rewards. Because DPO is applied at the latent level rather than to the full model, the method cuts per-user training time by 80–90% while delivering alignment quality competitive with LoRA- or P-Tuning-based PEFT baselines. Across IMDB text continuation, DailyDialog dialogue generation, and TL;DR summarization, Latent DPO shows strong personalized performance and clear efficiency gains, and additional experiments on Llama3-8B show consistent trends. The approach offers a scalable route to individual preference alignment, enabling large-scale customization without prohibitive computational cost.
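To make the idea of "DPO at the latent level" concrete, here is a minimal sketch in plain Python. It is not the paper's objective, which this summary does not spell out. It assumes the standard DPO loss with responses replaced by latent values: a preferred latent `z_w` and a dispreferred latent `z_l` are scored under a personalized diagonal-Gaussian encoder and under a frozen reference encoder. The function names, the Gaussian parameterization, and `beta=0.1` are all illustrative assumptions.

```python
import math


def gaussian_log_prob(z, mean, log_std):
    """Log-density of a diagonal Gaussian evaluated at the latent vector z."""
    total = 0.0
    for zi, mi, lsi in zip(z, mean, log_std):
        var = math.exp(2.0 * lsi)
        total += -0.5 * ((zi - mi) ** 2 / var + 2.0 * lsi + math.log(2.0 * math.pi))
    return total


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def latent_dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss, with log-probs of latent values instead of responses.

    Assumption: the personalized encoder plays the role of the policy and a
    frozen encoder plays the role of the reference model.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return softplus(-margin)  # equals -log(sigmoid(margin))


# Hypothetical 2-dimensional latents for one preference pair.
z_w, z_l = [1.0, 0.0], [-1.0, 0.0]
ref_mean, ref_log_std = [0.0, 0.0], [0.0, 0.0]

ref_w = gaussian_log_prob(z_w, ref_mean, ref_log_std)
ref_l = gaussian_log_prob(z_l, ref_mean, ref_log_std)

# Encoder identical to the reference: the margin is zero, so the loss is log 2.
loss_same = latent_dpo_loss(ref_w, ref_l, ref_w, ref_l)

# Encoder whose mean has moved toward the preferred latent: the loss decreases.
pol_mean = [0.5, 0.0]
pol_w = gaussian_log_prob(z_w, pol_mean, ref_log_std)
pol_l = gaussian_log_prob(z_l, pol_mean, ref_log_std)
loss_shifted = latent_dpo_loss(pol_w, pol_l, ref_w, ref_l)
```

In this sketch only the encoder's mean and log-standard-deviation would be trained, so each update touches a small network rather than the LLM. That is the intuition behind the reported per-user savings, though the actual architecture and reward definition may differ.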

Abstract

Aligning Large Language Models (LLMs) with general human preferences has proven crucial for improving the quality of interaction between LLMs and humans. However, human values vary widely across individuals, so aligning LLMs solely with general preferences is insufficient. Personalizing LLMs according to individual feedback is a promising solution, but it places heavy demands on the efficiency of alignment algorithms. In this work, we introduce a flexible paradigm for individual preference alignment. Our method fundamentally improves efficiency by disentangling preference representation from text generation in LLMs. We validate our approach across multiple text generation tasks and demonstrate that it achieves alignment quality on par with or better than PEFT-based methods, while reducing the additional training time required for each new individual preference by $80\%$ to $90\%$ relative to those methods.

Paper Structure (42 sections, 11 equations, 9 figures, 18 tables, 1 algorithm)


Introduction
Related Works
Preference Alignment
Variational Auto-Encoders
Methodology
Contrastive Language–Latent Pretraining
Personalization through Latent DPO
Inferring Preference on Latent Values
Applying DPO to Latent Values
Experiments
Tasks and Preferences
Text continuation on IMDB
Dialogue generation on DailyDialog
TL;DR Summarization
Baseline Methods
...and 27 more sections

Figures (9)

Figure 1: Our proposed method aims to offer flexible personalization learning from individual feedback, i.e., automatic individual adaptation in an efficient way.
Figure 2: Our method realizes efficient personalization for LLMs through three steps. Step 1 learns the posterior latent encoder (in green) and the latent adapter to disentangle representation and generation. Step 2 learns the personalized latent encoder (in yellow) from individual feedback. Step 3 steers personalized generation from LLMs under the guidance of personalized representations. Of these, only Step 2 must be repeated for each individual user, and it involves computation only in small networks (the latent encoders) rather than in the LLM.
Figure 3: Illustration of Eq. \ref{eq_DG_ELBo}, with condition $x$ omitted.
Figure 4: Illustration of Eq. \ref{eq_contrastive}, with condition $x$ omitted.
Figure 5: Additional training time on each new individual preference for different methods.
...and 4 more figures
