Interpreting Conversational Dense Retrieval by Rewriting-Enhanced Inversion of Session Embedding
Yiruo Cheng, Kelong Mao, Zhicheng Dou
TL;DR
The paper tackles the interpretability gap in conversational dense retrieval by introducing ConvInv, which converts opaque session embeddings into explicit text using a Vec2Text model trained on the ad-hoc query encoder. It further enhances interpretability by incorporating external conversational query rewrites to guide the transformation. In experiments on the QReCC and CAsT datasets with multiple retrievers, ConvInv largely preserves the original retrieval performance while yielding more human-interpretable text than baselines, and it remains robust across retrievers. This work bridges dense, latent representations with transparent textual explanations, supporting more trustworthy conversational search systems and more targeted improvements to them.
Abstract
Conversational dense retrieval has been shown to be effective in conversational search. However, a major limitation of conversational dense retrieval models is their lack of interpretability, which hinders an intuitive understanding of model behavior and makes targeted improvements difficult. This paper presents ConvInv, a simple yet effective approach for interpreting conversational dense retrieval models. ConvInv transforms opaque conversational session embeddings into explicitly interpretable text while preserving their original retrieval performance as faithfully as possible. This transformation is achieved by training a recently proposed Vec2Text model on the ad-hoc query encoder, leveraging the fact that session and query embeddings share the same space in existing conversational dense retrieval models. To further enhance interpretability, we propose incorporating external interpretable query rewrites into the transformation process. Extensive evaluations on three conversational search benchmarks demonstrate that ConvInv yields more interpretable text and preserves the original retrieval performance more faithfully than baselines. Our work connects opaque session embeddings with transparent query rewriting, paving the way toward trustworthy conversational search.
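The notion of faithfulness in the abstract can be illustrated with a minimal sketch. Because session and query embeddings share one space, the embedding of the recovered text can be compared directly with the session embedding, both by similarity and by the passages each retrieves. The example below is not the authors' implementation: the 3-dimensional vectors, the names `session_emb` and `inverted_text_emb`, and the helpers `cosine` and `top_k` are all hypothetical, and the Vec2Text inversion step itself is not modeled.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k(query_vec, passage_vecs, k):
    """Indices of the k passages with the highest dot-product score."""
    scores = [sum(q * p for q, p in zip(query_vec, vec)) for vec in passage_vecs]
    order = sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Hypothetical toy vectors (assumptions for illustration only).
# session_emb: output of a conversational session encoder.
# inverted_text_emb: ad-hoc query encoder applied to the recovered text.
session_emb = [0.9, 0.1, 0.4]
inverted_text_emb = [0.8, 0.2, 0.5]
passages = [
    [1.0, 0.0, 0.3],
    [0.0, 1.0, 0.0],
    [0.5, 0.1, 0.9],
    [0.1, 0.9, 0.2],
]

k = 2
similarity = cosine(session_emb, inverted_text_emb)
session_top = top_k(session_emb, passages, k)
inverted_top = top_k(inverted_text_emb, passages, k)
# Fraction of top-k passages shared by the two rankings.
overlap = len(set(session_top) & set(inverted_top)) / k

print(f"cosine similarity: {similarity:.3f}")
print(f"top-{k} overlap: {overlap:.2f}")
```

A high similarity and a high top-k overlap would suggest that the recovered text retrieves much the same passages as the original session embedding. Real evaluations use standard retrieval metrics on benchmark collections rather than this toy overlap score.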
