PEFT-U: Parameter-Efficient Fine-Tuning for User Personalization
Christopher Clarke, Yuzhao Heng, Lingjia Tang, Jason Mars
TL;DR
PEFT-U addresses the need to personalize large language models by introducing a large-scale user-centric benchmark with 13+ tasks across Hate+Abuse, sentiment, and humor, involving over 15k users. It evaluates both non-parametric prompting and parameter-efficient fine-tuning within a modular multi-user framework on Flan-T5, illustrating trade-offs between personalization quality and compute efficiency. The data collection treats annotators as individual users and enforces low inter-annotator agreement ($\alpha \leq 0.5$) to surface diverse perspectives. Among the methods compared, Adapters deliver the strongest overall gains across most tasks while remaining parameter-efficient. The benchmark provides a practical resource for developing scalable, user-aware NLP systems and points to future directions in personalized, modular LLM design under realistic compute constraints.
Abstract
The recent emergence of Large Language Models (LLMs) has heralded a new era of human-AI interaction. These sophisticated models, exemplified by ChatGPT and its successors, have exhibited remarkable capabilities in language understanding. However, as these LLMs have undergone exponential growth, a crucial dimension that remains understudied is the personalization of these models. Large foundation models such as GPT-3 focus on creating a universal model that serves a broad range of tasks and users. This approach emphasizes the model's generalization capabilities, treating users as a collective rather than as distinct individuals. While practical for many common applications, this one-size-fits-all approach often fails to address the rich tapestry of human diversity and individual needs. To explore this issue, we introduce the PEFT-U Benchmark: a new dataset for building and evaluating NLP models for user personalization. PEFT-U consists of a series of user-centered tasks containing diverse and individualized expressions, where the preferences of users can potentially differ for the same input. Using PEFT-U, we explore the challenge of efficiently personalizing LLMs to accommodate user-specific preferences in the context of diverse user-centered tasks.
