RPM: Reasoning-Level Personalization for Black-Box Large Language Models

Jieyong Kim, Tongyoung Kim, Soojin Yoon, Jaehyung Kim, Dongha Lee

TL;DR

RPM introduces reasoning-level personalization for black-box LLMs by constructing structured user factors from history, generating personalized reasoning paths, and using feature-based retrieval to guide inference. The framework yields state-of-the-art personalization across four tasks, while enhancing interpretability through explicit reasoning grounded in user features and factors. Extensive experiments show that reasoning-grounded prompts and memory-based retrieval outperform traditional response-level methods, with robust results across models and tasks and reasonable computational overhead. RPM thus offers a practical, interpretable, and scalable direction for tailoring black-box LLMs to individual user behavior.

Abstract

While black-box large language models are widely deployed, they produce generic outputs that overlook individual user preferences. Current personalization methods are fundamentally limited to response-level personalization; they only match final outputs, failing to model the underlying reasoning that connects user behavior to responses. To address this, this work introduces reasoning-level personalization as a new paradigm and proposes RPM, the first systematic framework designed to guide the model's reasoning process using structured rationales constructed from patterns in a user's behavior. RPM constructs a structured model of user behavior, built from response-influential features and statistical factors, to create personalized reasoning paths and retrieve beneficial examples for guiding inference through a feature-based retrieval mechanism. Extensive experiments across four diverse tasks demonstrate that RPM consistently outperforms existing response-level methods while simultaneously enhancing both personalization performance and interpretability, providing a promising direction for black-box LLM personalization.
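
To make the abstract's pipeline concrete, the following is a minimal sketch of how per-feature statistics ("factors") and feature-based retrieval might fit together. It is an illustration under stated assumptions, not the authors' implementation: the `Example` record, the count and mean-rating statistics, and the Jaccard overlap score are all hypothetical choices, and the abstract does not specify how RPM actually scores or retrieves examples.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """Hypothetical record of one past interaction for a single user."""
    features: frozenset  # response-influential features extracted from the query
    rating: int          # the user's actual response (here, a 1-5 rating)
    reasoning: str       # a stored reasoning path linking features to the rating


def build_factors(history):
    """Aggregate simple per-feature statistics over the user's history.

    Assumption: a "factor" is summarized here by a count and a mean rating.
    """
    totals = defaultdict(lambda: [0, 0])  # feature -> [count, rating sum]
    for ex in history:
        for feature in ex.features:
            totals[feature][0] += 1
            totals[feature][1] += ex.rating
    return {f: {"count": c, "mean_rating": s / c} for f, (c, s) in totals.items()}


def jaccard(a, b):
    """Overlap between two feature sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(history, query_features, k=2):
    """Return the k past examples whose feature sets best overlap the query's."""
    q = frozenset(query_features)
    ranked = sorted(history, key=lambda ex: jaccard(ex.features, q), reverse=True)
    return ranked[:k]


# Toy user history for a rating prediction task (made-up data).
history = [
    Example(frozenset({"slow pacing", "strong acting"}), 2,
            "Pacing outweighed the acting for this user."),
    Example(frozenset({"strong acting", "original plot"}), 5,
            "Originality and acting together drove a high rating."),
    Example(frozenset({"slow pacing"}), 1,
            "Slow pacing alone led to a low rating."),
]

factors = build_factors(history)
top = retrieve(history, {"slow pacing", "weak ending"}, k=2)

# Retrieved reasoning paths and factor statistics would then be placed in the
# prompt sent to the black-box LLM to guide its reasoning for the new query.
for ex in top:
    print(ex.rating, ex.reasoning)
```

In this sketch, retrieval keys on extracted features rather than on raw text similarity, which is one plausible reading of "feature-based retrieval"; a real system could substitute embeddings or a learned scorer.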

Paper Structure

This paper contains 38 sections, 5 equations, 6 figures, 22 tables.

Figures (6)

  • Figure 1: Comparison of response-level (Top) and reasoning-level (Bottom) personalization in a rating prediction task with scores from 1 to 5. Our approach generates personalized reasoning paths based on user-specific factors, enabling more accurate and interpretable predictions.
  • Figure 2: Overview of RPM. It extracts user-specific features/factors from user history and constructs reasoning examples by annotating personalized reasoning paths for query-response pairs. At inference time, it retrieves examples and generates the reasoning-aligned output guided by them.
  • Figure 3: Human evaluation on reasoning quality and validity of feature and factor.
  • Figure 4: Performance impact of user context scale. Subfigures (a)–(d) show the effect of varying the proportion of user history, and (e)–(h) show the effect of the number of retrieved examples.
  • Figure 5: The instruction and annotation guidelines provided within the human evaluation interface.
  • ...and 1 more figure