PET: Preference Evolution Tracking with LLM-Generated Explainable Distribution

Luyang Zhang, Jialu Wang, Shichao Zhu, Siyuan Peng, Beibei Li, Zhongcun Wang, Guangmou Pan, Yan Li, Yang Song

TL;DR

Preference Evolution Tracking (PET) is a framework that reframes LLM-based user preference prediction as inferring a dynamic probability distribution over a stable and interpretable lattice of preference clusters, paving the way for more explainable, fair, and diverse personalization systems.

Abstract

Understanding how user preferences evolve over time is a fundamental challenge in modern digital ecosystems. Large Language Models (LLMs) are an increasingly popular approach to it because they can comprehend the rich semantic context within behavioral data. A common practice is to use LLMs to predict a user's next action by directly generating a ranked list of preferred items. Although effective for short-term prediction, this end-to-end generation paradigm inherently limits personalization: its opaque decision-making process obscures holistic user profiling and exacerbates popularity bias. To address these limitations, we propose Preference Evolution Tracking (PET), a framework that reframes the task as inferring a dynamic probability distribution over a stable and interpretable lattice of preference clusters. By applying logit-probing and generative classification techniques, PET infers a user's preference as a probability distribution, enabling transparent preference learning. On public benchmarks (Yelp, MovieLens), PET improves ranking quality by up to 40% in NDCG over direct generation baselines. On a large-scale, real-world dataset from a short-video platform, it excels at ranking long-tail content, outperforming a SOTA production model by 7 times in NDCG. Ultimately, PET shifts user profile modeling from direct preference list generation to a transparent distributional preference mapping, paving the way for more explainable, fair, and diverse personalization systems.
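
As a rough illustration of what "inferring a user's preference as a probability distribution" can look like, the sketch below normalizes per-cluster scores with a softmax and ranks clusters by the resulting probabilities. The scores, the cluster names, and the function `scores_to_distribution` are hypothetical; the abstract does not spell out how the logit-probing scores are obtained, so this is one plausible reading rather than the paper's procedure.

```python
import math

def scores_to_distribution(cluster_scores):
    """Softmax over per-cluster scores.

    `cluster_scores` maps a cluster name to a real-valued score, e.g. a
    log-likelihood an LLM assigns to that cluster label (an assumption here).
    """
    top = max(cluster_scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in cluster_scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

# Hypothetical scores for three genre clusters.
scores = {"Comedy": -1.2, "Drama": -0.7, "Horror": -3.5}
dist = scores_to_distribution(scores)

# Ranking clusters is then just a sort on the inferred probabilities.
ranking = sorted(dist, key=dist.get, reverse=True)
```

Because the output is a full distribution rather than a generated list, low-probability (long-tail) clusters remain visible and comparable instead of being dropped from the output.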

Paper Structure

This paper contains 30 sections, 1 theorem, 9 equations, 3 figures, 8 tables, 3 algorithms.

Key Result

Lemma 1

Under the Isotonic Probing assumption, the ranking that PET produces by sorting its inferred probabilities maximizes the expected score for common order-aware metrics, including NDCG@k, Recall@k, and Precision@k. (See the proof in the appendix.)
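
The intuition behind the lemma can be checked numerically on a toy case. The sketch below is a simplified illustration, not the paper's proof: it assumes binary relevance with per-cluster relevance probabilities `probs` (hypothetical values) and uses expected DCG@k as a stand-in for the order-aware metrics named above. Under those assumptions, expected DCG@k is a sum of probabilities weighted by decreasing position discounts, so placing the largest probabilities first should be optimal; the code confirms this by brute force over all orderings.

```python
import math
from itertools import permutations

def expected_dcg_at_k(order, probs, k):
    """Expected DCG@k when cluster c is relevant with probability probs[c]."""
    return sum(
        probs[c] / math.log2(rank + 2)  # rank is 0-based, so the discount is log2(position + 1)
        for rank, c in enumerate(order[:k])
    )

# Hypothetical inferred probabilities for four clusters.
probs = {"A": 0.5, "B": 0.3, "C": 0.15, "D": 0.05}
k = 3

# Ranking obtained by sorting probabilities in descending order.
sorted_order = tuple(sorted(probs, key=probs.get, reverse=True))

# Exhaustive search over all 24 orderings for the best expected DCG@k.
best_order = max(permutations(probs), key=lambda o: expected_dcg_at_k(o, probs, k))
```

In this toy setting `best_order` coincides with `sorted_order`. The same rearrangement argument covers Precision@k and Recall@k, where every position in the top k carries equal weight.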

Figures (3)

  • Figure 1: The PET framework pipeline. Left (Training): An LLM is trained on user history to learn preference distributions. Center (Inference): Probing methods extract the predicted preference distribution from the model's internal logits. Right (Application): The transparent distribution is used for downstream tasks: personalized ranking, long-tail discovery, and interpretable user profiling.
  • Figure 2: Comprehensive comparison of different methods on the MovieLens dataset across a range of context windows (1, 3, 5, and 8 sessions). Prediction windows: long-term and short-term. Metrics: NDCG@1, NDCG@5, NDCG@10, and JS-Divergence. We compare our Logit Probing and Generative Classification methods against (1) a Direct Generation baseline trained (PT+SFT) on a Qwen3-8B base model and (2) a SOTA Qwen3-Reranker-8B ranking model. Note that JS-Divergence is not available for the Direct Generation baseline. Sample size: 38,434.
  • Figure 3: Evolution of group-level movie genre preferences as probability distributions across four time periods. Dataset: MovieLens, 125 users over 4 periods. Method: PET Likelihood-based Probing with PT+SFT on Qwen3-8B. Note: due to limited space, only the top-10 clusters (genres) are shown.
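
For readers unfamiliar with the JS-Divergence metric reported in Figure 2, the sketch below computes the standard Jensen-Shannon divergence between a predicted and an observed cluster distribution. The log base (2) and the absence of smoothing are assumptions; this excerpt does not state which variant the paper uses, and the example distributions are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded in [0, 1] with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture of the two distributions
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical predicted vs. observed preference over three clusters.
predicted = [0.6, 0.3, 0.1]
observed = [0.5, 0.4, 0.1]
score = js_divergence(predicted, observed)
```

A value of 0 means the two distributions are identical, and 1 means they have disjoint support. The metric needs a predicted distribution to compare, which is consistent with the caption's note that it is unavailable for a baseline that emits only a ranked list.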

Theorems & Definitions (3)

  • Lemma 1: Optimality of PET's Ranking
  • Remark 1: Sub-optimality of Direct Generation
  • proof