Analyzing the Effectiveness of Listwise Reranking with Positional Invariance on Temporal Generalizability
Soyoung Yoon, Jongyoon Kim, Seung-won Hwang
TL;DR
This paper investigates how listwise reranking, particularly ListT5 with Fusion-in-Decoder, performs under temporal distribution shifts in a real-world dynamic setting. By leveraging the LongEval benchmark, the study shows that listwise reranking can significantly improve temporal generalizability and mitigate positional bias, with pronounced gains as temporal drift increases (notably on the test-long subset). The authors compare pointwise and listwise approaches and demonstrate that ListT5 outperforms MonoT5 across metrics in zero-shot, drifted scenarios, supporting the viability of robust, time-aware ranking in dynamic information environments. The work emphasizes practical deployment implications, including efficient architectures (FiD-based ListT5), transparent evaluation using a proxy metric, and careful data preprocessing to reflect real-world noisy corpora.
Abstract
This working note outlines our participation in the retrieval task at CLEF 2024. We highlight the considerable gap between studying retrieval performance on static knowledge documents and understanding performance in real-world environments. Addressing these discrepancies and measuring the temporal persistence of IR systems is therefore crucial. By investigating the LongEval benchmark, which is specifically designed for such dynamic environments, we demonstrate the effectiveness of a listwise reranking approach, which proficiently handles inaccuracies induced by temporal distribution shifts. Among listwise rerankers, ListT5, which mitigates the positional bias problem by adopting the Fusion-in-Decoder architecture, is especially effective, and its advantage grows as temporal drift increases, most notably on the test-long subset.
