TempRetriever: Fusion-based Temporal Dense Passage Retrieval for Time-Sensitive Questions

Abdelrahman Abdallah, Bhawna Piryani, Jonas Wallat, Avishek Anand, Adam Jatowt

TL;DR

TempRetriever tackles time-sensitive QA by embedding temporal context directly into dense passage representations. By comparing three temporal encoding schemes (DateAsTag, DateAsToken, and a full temporal embedding approach) and four fusion methods (FS, VS, RE, EWI), it achieves superior retrieval performance on ArchivalQA and ChroniclingAmericaQA, even when paired with other temporal models in hybrid setups. A novel time-aware negative sampling strategy and a RAG integration demonstrate practical gains in both retrieval accuracy and downstream answer generation. The work substantiates the importance of temporal signals in dense retrieval and paves the way for broader time-aware QA across archival, legal, and historical domains.

Abstract

Temporal awareness is crucial in many information retrieval tasks, particularly in scenarios where the relevance of documents depends on their alignment with the query's temporal context. Traditional approaches such as BM25 and Dense Passage Retrieval (DPR) focus on lexical or semantic similarity but tend to neglect the temporal alignment between queries and documents, which is essential for time-sensitive tasks like temporal question answering (TQA). We propose TempRetriever, a novel extension of DPR that explicitly incorporates temporal information by embedding both the query date and the document timestamp into the retrieval process. This allows it to retrieve passages that are not only contextually relevant but also aligned with the temporal intent of the query. We evaluate TempRetriever on two large-scale datasets, ArchivalQA and ChroniclingAmericaQA, and show that it outperforms baseline retrieval models across multiple metrics. On ArchivalQA, TempRetriever achieves a 6.63% improvement in Top-1 retrieval accuracy and a 3.79% improvement in NDCG@10 over standard DPR. Similarly, on ChroniclingAmericaQA, it achieves a 9.56% improvement in Top-1 retrieval accuracy and a 4.68% improvement in NDCG@10. We also propose a novel time-based negative sampling strategy, which further enhances retrieval performance by addressing temporal misalignment during training. Our results underline the importance of temporal aspects in dense retrieval systems and establish a new benchmark for time-aware passage retrieval.
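
The results above are reported as Top-1 retrieval accuracy and NDCG@10. The following is a minimal sketch of how such metrics are commonly computed, not the paper's evaluation code. It assumes binary relevance labels and that each ranked list contains every judged passage for its query; the function names `top_k_accuracy` and `ndcg_at_k` are illustrative and do not come from the paper.

```python
import math
from typing import Sequence


def top_k_accuracy(ranked_relevance: Sequence[Sequence[int]], k: int) -> float:
    """Fraction of queries with at least one relevant passage in the top k.

    ranked_relevance[q][i] is 1 if the passage ranked at position i for
    query q is relevant, else 0 (assumed binary labels).
    """
    hits = sum(1 for rels in ranked_relevance if any(rels[:k]))
    return hits / len(ranked_relevance)


def ndcg_at_k(rels: Sequence[int], k: int) -> float:
    """nDCG@k for a single query, using the log2 position discount."""
    # Discounted cumulative gain of the ranking as retrieved.
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    # Ideal DCG: the same labels sorted so relevant passages come first.
    # Assumption: rels holds every judged passage for this query.
    ideal = sorted(rels, reverse=True)[:k]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Two toy queries: the first has its relevant passage at rank 1,
# the second at rank 2.
runs = [[1, 0, 0], [0, 1, 0]]
top1 = top_k_accuracy(runs, k=1)
mean_ndcg = sum(ndcg_at_k(r, k=10) for r in runs) / len(runs)
```

In this toy example only the first query counts as a Top-1 hit, while nDCG@10 still gives partial credit to the second query because its relevant passage appears at rank 2.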

Paper Structure

This paper contains 21 sections, 8 equations, 4 figures, 9 tables.

Figures (4)

  • Figure 1: A schematic overview of the training setup used in the different approaches: (a) VanillaDPR, (b) DateAsTag, (c) DateAsToken, and (d) TempRetriever. Due to space limitations, only the feature stacking technique is shown for the TempRetriever approach. Note that the positive passage's publication date (PD) may differ from the date in the query, as the figure's example also shows.
  • Figure 2: Top-k accuracy, MAP, and nDCG for different numbers of negative documents included during training on ChroniclingAmericaQA.
  • Figure 3: Comparison of nDCG for implicit questions across different retrieval models on the ArchivalQA dataset. Each subplot represents a specific retrieval model, showing performance for validation and test datasets at varying cut-off points (nDCG@k).
  • Figure 4: A simple unified framework for handling different question types using a Query Router.