RALI@TREC iKAT 2024: Achieving Personalization via Retrieval Fusion in Conversational Search
Yuchen Hui, Fengran Mo, Milan Mao, Jian-Yun Nie
TL;DR
The paper addresses personalization in conversational information retrieval (CIR) and the accompanying problem of over-personalization. It proposes retrieval fusion, which merges the rankings produced by LLM-assisted query reformulations that differ in their degree of personalization. The authors implement both manual and automatic two-stage pipelines that retrieve with BM25 and rerank with MonoT5 or RankLlama. In one automatic configuration, GPT-4o rewrites the query and the fusion rule $S_{final}(D)=\alpha_1 S_1(D)+\alpha_2 S_2(D)+\alpha_3 S_3(D)$ combines the scores of document $D$ under three de-contextualized queries: non-personalized, expanded, and personalized. One variant adds a subsequent reranking step and response generation with GPT-4o. On iKAT 2024, the fusion-plus-reranking approach achieves the best passage-ranking performance among submissions. The automatic methods, when compared with manual rewrites, reveal biases related to pooling, and the authors attribute RankLlama's strong year-over-year performance to assessment bias. The study demonstrates the practicality of retrieval fusion for personalized CIR and highlights evaluation biases in pool-based shared tasks, suggesting directions for more robust fusion strategies and test-collection reuse in future work.
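The fusion rule above is a weighted sum of per-query retrieval scores, and a minimal sketch may make it concrete. The function name `fuse_scores`, the dictionary representation of a run, the example scores, and the weights are all illustrative assumptions, not the authors' implementation. The sketch also assumes that the scores of the three runs are already on comparable scales and that a document missing from a run contributes a score of zero for that run.

```python
from collections import defaultdict


def fuse_scores(runs, weights):
    """Linear score fusion: S_final(D) = sum_i alpha_i * S_i(D).

    runs    -- list of dicts mapping doc_id -> score, one dict per query variant
    weights -- list of alpha_i values, one per run
    Assumption: a document absent from a run contributes 0 for that run.
    """
    if len(runs) != len(weights):
        raise ValueError("need exactly one weight per run")
    fused = defaultdict(float)
    for run, alpha in zip(runs, weights):
        for doc_id, score in run.items():
            fused[doc_id] += alpha * score
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for three de-contextualized query variants.
non_personalized = {"d1": 1.0, "d2": 0.5}
expanded = {"d2": 1.0, "d3": 0.5}
personalized = {"d1": 0.5, "d3": 1.0}

ranking = fuse_scores(
    [non_personalized, expanded, personalized],
    [0.5, 0.25, 0.25],  # illustrative alpha_1, alpha_2, alpha_3
)
print(ranking)
```

With these illustrative weights, the non-personalized run dominates, so a document that only the personalized query ranks highly cannot displace one that matches the original query well. This is one way to read how fusion could limit over-personalization.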
Abstract
The Recherche Appliquée en Linguistique Informatique (RALI) team participated in the 2024 TREC Interactive Knowledge Assistance (iKAT) Track. In personalized conversational search, effectively capturing a user's complex search intent requires incorporating both contextual information and key elements from the user profile into query reformulation. The user profile often contains many relevant pieces, each of which could complement the user's information need. Disregarding any of them is difficult, yet introducing too many risks drifting away from the original query and hurting search performance. We call this challenge over-personalization. To address it, we propose several strategies that fuse ranking lists generated from queries with different levels of personalization.
