PersonalLLM: Tailoring LLMs to Individual Preferences

Thomas P. Zollo, Andrew Wei Tung Siah, Naimeng Ye, Ang Li, Hongseok Namkoong

TL;DR

PersonalLLM introduces a public benchmark to study LLM personalization by pairing open-ended prompts with multiple high-quality responses and simulating diverse user preferences via ensembles of reward models. The dataset enables learning across users under sparse feedback, using a Dirichlet-based sampling of reward-model weights to generate heterogeneous personas. Analyses show genuine preference diversity, semantic/syntactic effects, and reasonable alignment with human opinions, while personalization experiments via in-context learning and meta-learning demonstrate the feasibility and current limitations of idiosyncratic personalization. The work lays a foundation for developing cross-user personalization methods and highlights safety, fairness, and robustness considerations for practical deployment.

Abstract

As LLMs become capable of complex tasks, there is growing potential for personalized interactions tailored to the subtle and idiosyncratic preferences of the user. We present a public benchmark, PersonalLLM, focusing on adapting LLMs to provide maximal benefits for a particular user. Departing from existing alignment benchmarks that implicitly assume uniform preferences, we curate open-ended prompts paired with many high-quality answers over which users would be expected to display heterogeneous latent preferences. Instead of persona-prompting LLMs based on high-level attributes (e.g., user's race or response length), which yields homogeneous preferences relative to humans, we develop a method that can simulate a large user base with diverse preferences from a set of pre-trained reward models. Our dataset and generated personalities offer an innovative testbed for developing personalization algorithms that grapple with continual data sparsity--little relevant feedback from the particular user--by leveraging historical data from other (similar) users. We explore basic in-context learning and meta-learning baselines to illustrate the utility of PersonalLLM and highlight the need for future methodological development. Our dataset is available at https://huggingface.co/datasets/namkoong-lab/PersonalLLM
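
The user simulation described above (Dirichlet-sampled weights over an ensemble of reward models) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the number of reward models, and the random placeholder scores are assumptions made for illustration, and the persona's reward is assumed to be a weighted sum of the individual reward-model scores. Only the use of a Dirichlet distribution with concentration $\alpha$ and the 8 responses per prompt come from the text.

```python
import numpy as np


def sample_personas(num_users, num_reward_models, alpha, seed=0):
    """Draw one weight vector per simulated user from a symmetric Dirichlet.

    Small alpha concentrates each user's weight on a few reward models;
    large alpha pushes every user toward the uniform average.
    """
    rng = np.random.default_rng(seed)
    concentration = np.full(num_reward_models, alpha)
    return rng.dirichlet(concentration, size=num_users)


def persona_scores(weights, reward_scores):
    """Score each response for each user as a weighted sum of reward models.

    weights:       (num_users, num_reward_models)
    reward_scores: (num_reward_models, num_responses)
    returns:       (num_users, num_responses)
    """
    return weights @ reward_scores


# Hypothetical sizes: 10 reward models is an assumption, 8 responses matches
# the 8 LLMs mentioned in the figure captions.
num_users, num_reward_models, num_responses = 1000, 10, 8

# Placeholder scores standing in for real reward-model outputs on one prompt.
reward_scores = np.random.default_rng(1).normal(
    size=(num_reward_models, num_responses)
)

weights = sample_personas(num_users, num_reward_models, alpha=0.5)
scores = persona_scores(weights, reward_scores)
preferred = scores.argmax(axis=1)  # index of each user's favorite response
```

Under these assumptions, rerunning with a much larger `alpha` should make the entries of `preferred` agree more often, since every simulated user then weights the reward models almost equally.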

Paper Structure

This paper contains 43 sections, 2 equations, 6 figures, 5 tables, 1 algorithm.

Figures (6)

  • Figure 1: Standard LLMs require tedious re-prompting to learn a user’s preferences in each session. PersonalLLM aims to learn a unique user's diverse preferences to maximize long-term satisfaction.
  • Figure 2: In the canonical personalization setting, a dataset of historical users and their interactions is leveraged to personalize interactions for a new user with a limited history. PersonalLLM enables the development of such methods for learning across users.
  • Figure 3: Left: Existing alignment datasets contain prompts paired with multiple responses, where the majority of people are expected to prefer one specific response (e.g., a harmless response). Right: Our dataset consists of prompts paired with many high-quality responses, each favored by different personas. Such a dataset induces diverse preferences in our personal preference models, creating a testbed to build PersonalLLMs.
  • Figure 4: Probing the heterogeneous preferences of our simulated users across the PersonalLLM dataset given different settings of $\alpha$, and comparing to a persona prompting baseline. Top: For a population of simulated users, the percentage of each population's vote share given to the most common winning response for each prompt; lower values indicate more preference diversity. Middle: A histogram showing the number of responses that receive at least one vote from a simulated population for each prompt; diverse preferences cause higher concentration on the right side of each plot. Bottom: Average win rates across the population for the 8 LLMs in our dataset.
  • Figure 5: Analysis of simulated user preferences with respect to prompt and response contents. Left, middle: For each user, we train a regression model to predict winning responses based on either semantic (left) or syntactic (middle) features. For each feature, we show a box plot with the resultant regression coefficient across users. Right: We examine the entropy in population preferences based on keywords in prompts, comparing words we would expect to inspire heterogeneity (e.g., imagine, opinion, poem) to prompts beginning with "who", "when", and "where", which should evoke more objective answers.
  • ...and 1 more figure
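
The per-prompt diversity statistics named in the captions of Figures 4 and 5 (vote share of the most common winner, number of responses receiving at least one vote, and entropy of population preferences) could be computed along the following lines. This is a sketch under stated assumptions: the helper names are hypothetical, `winners` is assumed to hold each simulated user's top-ranked response index for one prompt, and the entropy is assumed to be Shannon entropy in nats over the winning-response distribution, which the captions do not specify.

```python
import numpy as np


def vote_counts(winners, num_responses):
    """Count how many users picked each response as their favorite."""
    return np.bincount(winners, minlength=num_responses)


def top_vote_share(winners, num_responses):
    """Fraction of users whose favorite is the most common winning response."""
    counts = vote_counts(winners, num_responses)
    return counts.max() / counts.sum()


def responses_with_votes(winners, num_responses):
    """Number of distinct responses that at least one user ranked first."""
    return int((vote_counts(winners, num_responses) > 0).sum())


def preference_entropy(winners, num_responses):
    """Shannon entropy (nats) of the distribution over winning responses."""
    counts = vote_counts(winners, num_responses)
    probs = counts[counts > 0] / counts.sum()
    return float(-(probs * np.log(probs)).sum())


# Toy example: six users voting over 8 candidate responses to one prompt.
winners = np.array([0, 0, 1, 2, 0, 1])
share = top_vote_share(winners, num_responses=8)
distinct = responses_with_votes(winners, num_responses=8)
entropy = preference_entropy(winners, num_responses=8)
```

In this toy example, response 0 wins three of six votes and three distinct responses receive votes. A population with more varied preferences would show a lower top vote share, more distinct winners, and higher entropy.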