EpiPersona: Persona Projection and Episode Coupling for Pluralistic Preference Modeling

Yujie Zhang, Weikang Yuan, Zhuoren Jiang, Pengwei Yan

Abstract

Pluralistic alignment is essential for adapting large language models (LLMs) to the diverse preferences of individuals and minority groups. However, existing approaches often mix stable personal traits with episode-specific factors, limiting their ability to generalize across episodes. To address this challenge, we introduce EpiPersona, a framework for explicit persona-episode coupling. EpiPersona first projects noisy preference feedback into a low-dimensional persona space, where similar personas are aggregated into shared discrete codes. This process separates enduring personal characteristics from situational signals without relying on predefined preference dimensions. The inferred persona representation is then coupled with the current episode, enabling episode-aware preference prediction. Extensive experiments show that EpiPersona consistently outperforms the baselines. It achieves notable performance gains in hard episodic-shift scenarios, while remaining effective with sparse preference data.

Paper Structure

This paper contains 46 sections, 21 equations, 6 figures, 5 tables, and 2 algorithms.

Figures (6)

  • Figure 1: We propose EpiPersona, which first maps the preference feedback space $X$ into a persona space $\mathcal{Z}_p$. Based on an individual’s historical preference information, we model the individual’s latent persona $Z_u$. We further introduce two variants of EpiPersona that couple persona $Z_u$ with episode context $e$ for preference prediction: EpiPersona-A, which is tailored to inferring individuals’ episode-specific preferences in a plug-and-play manner, and EpiPersona-B, which is designed for pluralistic reward learning.
  • Figure 2: Overview of the persona projection mapping.
  • Figure 3: Evaluation of EpiPersona under the influence of different factors. (A) Episode similarity: performance under high vs. low similarity, showing smaller drops for EpiPersona under the episode-shift scenario. (B) Number of preference feedback instances: model performance across users with varying amounts of historical feedback, highlighting advantages in sparse scenarios. (C) Window size: effect of the number of observable preference feedback instances on EpiPersona.
  • Figure 4: The distribution of the history samples per user.
  • Figure 5: Model architecture details (parameterized abductive reasoning and VQ-based mapping).
  • ...and 1 more figure
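
To make the persona projection step more concrete, the following is a minimal sketch of the idea described in the abstract and in Figures 1 and 5: feedback embeddings are projected into a low-dimensional persona space, pooled per user, and snapped to the nearest entry of a shared discrete codebook. It is an illustration under stated assumptions, not the authors' implementation. The linear projection matrix `W`, the mean pooling over a user's history, the squared Euclidean distance, and all function names and toy values are hypothetical; the paper's parameterized abductive reasoning module and its training objective are not reproduced here.

```python
# Toy sketch of persona projection followed by nearest-code quantization.
# Everything here (W, the codebook, the pooling rule) is an illustrative
# assumption, not the architecture reported in the paper.

def project(feedback_vec, W):
    """Linearly project one feedback embedding into the persona space."""
    return [sum(w * x for w, x in zip(row, feedback_vec)) for row in W]


def mean_pool(vectors):
    """Average a list of equal-length vectors, dimension by dimension."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_code(z, codebook):
    """Return the index of the codebook entry closest to z (squared L2)."""
    distances = [
        sum((a - b) ** 2 for a, b in zip(z, code)) for code in codebook
    ]
    return min(range(len(codebook)), key=distances.__getitem__)


def infer_persona_code(history, W, codebook):
    """Project a user's feedback history, pool it, and quantize it."""
    projected = [project(x, W) for x in history]
    z_u = mean_pool(projected)  # continuous persona estimate for the user
    return nearest_code(z_u, codebook)


# Hypothetical 3-d feedback space mapped to a 2-d persona space.
# The third input dimension is discarded, standing in for a
# situational signal that the projection filters out.
W = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]

user_a = [[0.9, 1.1, 0.3], [1.1, 0.8, -0.2]]
user_b = [[1.2, 0.9, 5.0]]   # single feedback instance (sparse history)
user_c = [[-0.8, 1.2, 0.0]]

code_a = infer_persona_code(user_a, W, codebook)
code_b = infer_persona_code(user_b, W, codebook)
code_c = infer_persona_code(user_c, W, codebook)
print(code_a, code_b, code_c)
```

In this toy setup, users A and B land on the same code even though their third feedback dimension differs sharply, while user C is assigned a different code. Sharing a discrete code is one way a user with a single feedback instance could borrow structure from users with similar personas, which is in the spirit of the sparse-data behavior the abstract describes.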