A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models

Zhouhang Xie, Junda Wu, Yiran Shen, Yu Xia, Xintong Li, Aaron Chang, Ryan Rossi, Sachin Kumar, Bodhisattwa Prasad Majumder, Jingbo Shang, Prithviraj Ammanabrolu, Julian McAuley

TL;DR

The paper surveys personalized and pluralistic alignment for LLMs, arguing that universal alignment is insufficient and proposing a taxonomy that spans training-time, test-time, and user-modeling approaches. It formalizes the problem with per-user reward functions and discusses how shared parameters, steerable prompts, memory, and decoding-time interventions can capture per-user preferences. The review covers training-time methods (user-specific adapters, heads, mixture-of-experts, and reward models) and test-time methods (prompting, reward-guided decoding, and logit rectification), and discusses data challenges and evaluation gaps. It also highlights the role of personalized user modeling in enabling and evaluating personalized alignment, and outlines future directions: online continual personalization, handling long and complex value statements, and standardized benchmarks for measuring progress in this rapidly evolving field.

Abstract

Personalized preference alignment for large language models (LLMs), the process of tailoring LLMs to individual users' preferences, is an emerging research direction spanning the areas of NLP and personalization. In this survey, we present an analysis of works on personalized alignment and modeling for LLMs. We introduce a taxonomy of preference alignment techniques, including training-time, inference-time, and user-modeling-based methods. We analyze and discuss the strengths and limitations of each group of techniques, and then cover evaluation, benchmarks, and open problems in the field.

Paper Structure

This paper contains 20 sections, 2 equations, 4 figures, and 1 table.

Figures (4)

  • Figure 1: (1a) Overview of personalized preference alignment for LLMs. This includes training-time and test-time methods, leveraging various forms of feedback such as verbal feedback and choices. (1b) An oversimplified decision tree for determining the class of method to use for personalized preference alignment.
  • Figure 2: Taxonomy of techniques for personalized and pluralistic preference alignment.
  • Figure 3: Personalized language model alignment during training time.
  • Figure 4: Personalized language model alignment during test time.