Bridging Personalization and Control in Scientific Personalized Search

Sheshera Mysore, Garima Dhanania, Kishor Patil, Surya Kallumadi, Andrew McCallum, Hamed Zamani

TL;DR

The study targets the opacity and limited user control of personalized search, particularly in scientific domains, and introduces CtrlCE, a memory-augmented cross-encoder that blends a query-document embedding score with a memory-based user score via a calibrated mixing model that decides when personalization should apply. CtrlCE integrates two editable memory designs (concept-value memories and item memories) with an embedding cross-encoder so that a multi-vector memory can influence rankings through a calibrated weight $w$, yielding $s_d = w \cdot s_d^q + (1-w) \cdot s_d^u$, where $s_d^q$ is the query-document score from the cross-encoder, $f_{\text{CE}}(q,d) = \mathbf{q}^T \mathbf{d}$, and $s_d^u$ is the user score from the memory interactions $f_{\text{Mem}}$. Training proceeds in two stages (stage 1: train $\texttt{Enc}_{\text{CE}}$; stage 2: train $g_{\text{Mix}}$ with a scale-calibrated loss and an anchor $y_0$) so that $w$ tracks CE performance and can signal when user edits are beneficial. Across four scientific domains, calibration analyses, and a user study, CtrlCE achieves 6.4–10.6% improvements in NDCG@10 over strong baselines while enabling explicit no-personalization and selective personalization, supporting practical controllability and readability of user profiles.
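The score mixing described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy vectors, the max-similarity stand-in for $f_{\text{Mem}}$, and the helper names (`cross_encoder_score`, `memory_score`, `mix_scores`, `rank`) are all assumptions made for illustration, and $w$ is set by hand here rather than predicted by a trained $g_{\text{Mix}}$.

```python
import numpy as np

def cross_encoder_score(q_vec, d_vecs):
    """Dot-product query-document score, q^T d, for every candidate document."""
    return d_vecs @ q_vec

def memory_score(memory_vecs, d_vecs):
    """Hypothetical stand-in for f_Mem: best match between any memory vector
    and each document. An empty memory contributes a score of zero."""
    if len(memory_vecs) == 0:
        return np.zeros(len(d_vecs))
    return (d_vecs @ memory_vecs.T).max(axis=1)

def mix_scores(s_q, s_u, w):
    """s_d = w * s_d^q + (1 - w) * s_d^u, with the mixing weight w in [0, 1]."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * s_q + (1.0 - w) * s_u

def rank(scores):
    """Document indices sorted from highest to lowest score."""
    return np.argsort(-scores, kind="stable").tolist()

# Toy data (made up): one query, three documents, one user-memory vector.
q_vec = np.array([1.0, 0.0])
d_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
memory_vecs = np.array([[0.0, 1.0]])

s_q = cross_encoder_score(q_vec, d_vecs)
s_u = memory_score(memory_vecs, d_vecs)

# A high w lets the query score dominate; a low w lets the memory dominate.
query_led = rank(mix_scores(s_q, s_u, w=0.8))
memory_led = rank(mix_scores(s_q, s_u, w=0.2))

# Simulated user edit: deleting the memory entry removes its influence.
edited_memory = np.empty((0, 2))
after_edit = rank(mix_scores(s_q, memory_score(edited_memory, d_vecs), w=0.2))

print(query_led, memory_led, after_edit)
```

In this toy setup, the same low value of $w$ yields a memory-driven ranking before the edit and a query-driven ranking after it, which is the kind of control an editable memory is meant to provide.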

Abstract

Personalized search is a problem where models benefit from learning user preferences from per-user historical interaction data. The inferred preferences enable personalized ranking models to improve the relevance of documents for users. However, personalization is also seen as opaque in its use of historical interactions and is not amenable to users' control. Further, personalization limits the diversity of information users are exposed to. While search results may be automatically diversified, this does little to address the lack of control over personalization. In response, we introduce a model for personalized search that enables users to control personalized rankings proactively. Our model, CtrlCE, is a novel cross-encoder model augmented with an editable memory built from users' historical interactions. The editable memory allows cross-encoders to be personalized efficiently and enables users to control personalized ranking. Next, because not all queries require personalization, we introduce a calibrated mixing model which determines when personalization is necessary. This enables users to control personalization via their editable memory only when necessary. To thoroughly evaluate CtrlCE, we demonstrate its empirical performance in four domains of science, its ability to selectively request user control in a calibration evaluation of the mixing model, and the control provided by its editable memory in a user study.

Paper Structure

This paper contains 30 sections, 5 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Our approach, CtrlCE, augments a cross-encoder with an editable user profile using a calibrated mixing model. Our training procedure ensures that the mixing model's score $w$ remains proportional to the performance of $f_{\text{CE}}$. This ensures that it can be used to seek edits to a user profile only when necessary.
  • Figure 2: Concept-value memories represent users with concepts and their personalized concept values. Item memories directly represent users with item representations.
  • Figure 3: Scores produced by the mixing model $g_{\text{Mix}}$, used to combine $f_{\text{CE}}$ and $f_{\text{Mem}}$ (the high-level score equation), plotted against the NDCG@10 for $f_{\text{CE}}$. CtrlCE$_{\text{CV}}$ (blue) is compared against a model trained without a calibrated objective (pink). Our calibrated objective ensures that $g_{\text{Mix}}$ scores are proportional to $f_{\text{CE}}$ performance. CtrlCE$_{\text{It}}$ shows identical trends.
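
The relationship plotted in Figure 3 can be probed with a small Python sketch, given here as one simple check and not as the paper's evaluation protocol. It computes NDCG@10 per query for the cross-encoder ranking and correlates it with the mixing weight. The relevance labels and weights below are made-up toy values, not results from the paper, and the linear-gain NDCG variant and the use of Pearson correlation are assumptions.

```python
import numpy as np

def ndcg_at_k(relevances, k=10):
    """NDCG@k for one ranked list of relevance labels, given in rank order.
    Uses linear gain (rel / log2(rank + 1)); other variants use 2^rel - 1."""
    rel = np.asarray(relevances, dtype=float)
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float((rel * discounts)[:k].sum())
    ideal = np.sort(rel)[::-1]  # best possible ordering of the same labels
    idcg = float((ideal * discounts)[:k].sum())
    return dcg / idcg if idcg > 0 else 0.0

# Toy data (made up): relevance labels of the documents as ranked by the
# cross-encoder alone, one list per query, plus a mixing weight per query.
ce_rankings = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
]
mix_weights = np.array([0.9, 0.7, 0.2, 0.4])

ce_ndcg = np.array([ndcg_at_k(r, k=10) for r in ce_rankings])

# A well-calibrated mixing model should give higher weights to queries on
# which the cross-encoder already ranks well, so this should be positive.
pearson_r = float(np.corrcoef(mix_weights, ce_ndcg)[0, 1])
print(ce_ndcg.round(3), round(pearson_r, 3))
```

Under this reading, queries with a low weight are the ones where the cross-encoder alone performs poorly, and so are the natural candidates for asking the user to edit their profile.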