AI PERSONA: Towards Life-long Personalization of LLMs

Tiannan Wang, Meiling Tao, Ruoyu Fang, Huilin Wang, Shuai Wang, Yuchen Eleanor Jiang, Wangchunshu Zhou

TL;DR

This work formalizes life-long personalization for large language models and presents AI Persona, a lightweight framework that requires no retraining and continually adapts to individual user profiles represented as learnable dictionaries. A 5-stage data generation pipeline (seed data collection, persona synthesis, scene generation, personalized query generation, and data filtering) produces realistic, evolving persona data and interactions, while a three-component framework (Historical Session Manager, Tool Executor, Personalized Chatbot) supports continuous adaptation. The authors introduce PersonaBench, a benchmark with 200 diverse personas and thousands of interactions for evaluating long-horizon personalization, and report that updating the persona every ~3 conversations yields performance close to a Golden Persona upper bound across multiple base LLMs. Experimental results show improved personalized response quality, fewer dialogue turns needed to satisfy the user, and stable persona alignment without retraining, which points to practical value for scalable, user-centric AI assistants. Limitations include language bias in the seed data and the need for multilingual validation in future work.
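The idea of a persona stored as a dictionary and refreshed every few conversations can be illustrated with a minimal sketch. This is not the authors' implementation: `SessionBuffer`, `update_persona`, and `toy_extract` are hypothetical names, and the `key=value` parsing merely stands in for whatever LLM-based extraction the framework actually uses. The update interval of 3 follows the "every ~3 conversations" figure quoted above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Persona = Dict[str, str]   # persona represented as a plain dictionary
Session = List[str]        # utterances from one finished conversation


@dataclass
class SessionBuffer:
    """Hypothetical buffer: holds finished sessions until the next update."""
    update_every: int = 3
    pending: List[Session] = field(default_factory=list)

    def add(self, session: Session) -> bool:
        """Store a session; return True once an update is due."""
        self.pending.append(session)
        return len(self.pending) >= self.update_every

    def flush(self) -> List[Session]:
        """Return the buffered sessions and clear the buffer."""
        sessions, self.pending = self.pending, []
        return sessions


def toy_extract(session: Session) -> Persona:
    """Assumed stand-in for an LLM call: reads 'key=value' utterances."""
    facts: Persona = {}
    for utterance in session:
        if "=" in utterance:
            key, value = utterance.split("=", 1)
            facts[key.strip()] = value.strip()
    return facts


def update_persona(persona: Persona, sessions: List[Session],
                   extract: Callable[[Session], Persona]) -> Persona:
    """Merge extracted facts into a copy of the persona; newer values win."""
    updated = dict(persona)
    for session in sessions:
        updated.update(extract(session))
    return updated


def run(sessions: List[Session], update_every: int = 3) -> Persona:
    """Process sessions in order, updating the persona every `update_every`."""
    buffer = SessionBuffer(update_every)
    persona: Persona = {}
    for session in sessions:
        if buffer.add(session):
            persona = update_persona(persona, buffer.flush(), toy_extract)
    return persona
```

In this sketch no model weights change: only the dictionary is rewritten, which is the sense in which such a scheme avoids retraining. A session that arrives after the last update stays in the buffer and does not affect the persona until the next interval is reached.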

Abstract

In this work, we introduce the task of life-long personalization of large language models. While recent mainstream efforts in the LLM community mainly focus on scaling data and compute for improved capabilities of LLMs, we argue that it is also very important to enable LLM systems, or language agents, to continuously adapt to the diverse and ever-changing profiles of every distinct user and provide up-to-date personalized assistance. We provide a clear task formulation and introduce a simple, general, effective, and scalable framework for life-long personalization of LLM systems and language agents. To facilitate future research on LLM personalization, we also introduce methods to synthesize realistic benchmarks and robust evaluation metrics. We will release all code and data for building and benchmarking life-long personalized LLM systems.

Paper Structure

This paper contains 34 sections, 2 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: Data generation pipeline for PersonaBench. This pipeline consists of 5 stages: seed data collection, persona synthesis, scene generation, personalized query generation, and data filtering and refinement.
  • Figure 2: AI Persona Framework.
  • Figure 3: Average number of utterances required per scene. The blue line represents Persona Learning, the orange line represents Golden Persona, and the green line represents No Persona. Lower average utterance counts indicate better performance: the dialogue is more efficient and the model needs fewer turns to satisfy the user.
  • Figure 4: Average winning rate of the pair-wise comparison of Golden Persona and Persona Learning as the scene number increases.
  • Figure 5: Average number of utterances for different model bases and persona settings.