Faithful Temporal Question Answering over Heterogeneous Sources

Zhen Jia, Philipp Christmann, Gerhard Weikum

TL;DR

This work tackles temporal question answering with explicit and implicit time constraints across heterogeneous sources: a knowledge base (KB), text, and tables. It introduces FAITH, a three-stage pipeline comprising Temporal Question Understanding, Faithful Evidence Retrieval, and Explainable Heterogeneous Answering, in which implicit constraints are resolved through recursively answered intermediate questions. A key contribution is the Tiq benchmark, built automatically to probe implicit temporal reasoning across diverse sources. Empirical results show that FAITH gives the best faithful-answering results on Tiq and performs strongly on TimeQuestions, outperforming LLMs and unfaithful baselines while exposing the evidence behind each answer for trust and explainability.

Abstract

Temporal question answering (QA) involves time constraints, with phrases such as "... in 2019" or "... before COVID". In the former, time is an explicit condition; in the latter, it is implicit. State-of-the-art methods have limitations along three dimensions. First, with neural inference, time constraints are merely soft-matched, giving room to invalid or inexplicable answers. Second, questions with implicit time are poorly supported. Third, answers come from a single source: either a knowledge base (KB) or a text corpus. We propose a temporal QA system that addresses these shortcomings. First, it enforces temporal constraints for faithful answering with tangible evidence. Second, it properly handles implicit questions. Third, it operates over heterogeneous sources, covering KB, text and web tables in a unified manner. The method has three stages: (i) understanding the question and its temporal conditions, (ii) retrieving evidence from all sources, and (iii) faithfully answering the question. As implicit questions are sparse in prior benchmarks, we introduce a principled method for generating diverse questions. Experiments show superior performance over a suite of baselines.
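
To make the contrast between soft matching and enforced constraints concrete, here is a minimal Python sketch of a hard temporal filter over evidence from mixed sources. It is an illustration under stated assumptions, not the paper's implementation: the `Interval` and `Evidence` classes, the `enforce_constraint` function, and the "Label A/B/C" records are all hypothetical names and placeholder data introduced here.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Interval:
    start: date
    end: date

    def overlaps(self, other: "Interval") -> bool:
        # Two closed intervals overlap iff each one starts before the other ends.
        return self.start <= other.end and other.start <= self.end


@dataclass(frozen=True)
class Evidence:
    text: str
    source: str                # "kb", "text", or "table" (hypothetical tags)
    valid: Optional[Interval]  # None when no time scope is known


def enforce_constraint(evidence: List[Evidence], constraint: Interval) -> List[Evidence]:
    # Hard filter: evidence with no known time scope, or with a scope that
    # does not overlap the constraint, is dropped rather than down-weighted.
    return [e for e in evidence
            if e.valid is not None and e.valid.overlaps(constraint)]


# Explicit constraint "... in 1975" expressed as an interval.
in_1975 = Interval(date(1975, 1, 1), date(1975, 12, 31))

# Placeholder evidence, not real facts.
candidates = [
    Evidence("Label A", "kb", Interval(date(1973, 1, 1), date(1980, 12, 31))),
    Evidence("Label B", "text", Interval(date(1991, 1, 1), date(2000, 1, 1))),
    Evidence("Label C", "table", None),
]

kept = enforce_constraint(candidates, in_1975)
print([e.text for e in kept])  # ['Label A']
```

Dropping evidence that lacks a time scope is one possible design choice for this sketch; a real system might instead try to infer the scope before filtering.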

Paper Structure (24 sections, 4 figures, 10 tables)

Figures (4)

  • Figure 1: Overview of the Faith pipeline. The figure illustrates the process for answering $q_3$ ("Queen’s record company when recording Bohemian Rhapsody?") and $q_1$ ("Record company of Queen in 1975?"). For answering $q_3$, two intermediate questions $q_{31}$ and $q_{32}$ are generated, and run recursively through the entire temporal QA system.
  • Figure 2: Steps to create implicit questions with our proposed methodology, highlighting the key configurable parts.
  • Figure 3: Distribution of questions over input source combinations (source for main part; source for implicit part).
  • Figure 4: P@1 of Faith when considering top-$k$ answers for the generated intermediate question(s) of implicit questions.
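
The recursion described in the Figure 1 caption can be sketched in Python as follows. This is a rough illustration under assumptions, not the authors' code: the `answer` function, the wording of the intermediate questions, the rewritten explicit question, and the `DATE_A`, `DATE_B`, and `LABEL_X` values in the toy lookup are hypothetical placeholders.

```python
from typing import Callable, Optional


def answer(main_part: str,
           implicit_part: Optional[str],
           lookup: Callable[[str], str]) -> str:
    # Explicit (or constraint-free) question: answer it directly.
    if implicit_part is None:
        return lookup(main_part)

    # Implicit question: generate two intermediate questions and run each
    # through the same procedure recursively (assumed phrasing).
    start = answer(f"When did {implicit_part} start?", None, lookup)
    end = answer(f"When did {implicit_part} end?", None, lookup)

    # Rewrite the implicit constraint as an explicit interval, then answer.
    explicit = f"{main_part} between {start} and {end}?"
    return lookup(explicit)


# Toy stand-in for the full QA system; keys and values are placeholders.
toy_answers = {
    "When did the recording of Bohemian Rhapsody start?": "DATE_A",
    "When did the recording of Bohemian Rhapsody end?": "DATE_B",
    "Queen's record company between DATE_A and DATE_B?": "LABEL_X",
}

result = answer("Queen's record company",
                "the recording of Bohemian Rhapsody",
                toy_answers.__getitem__)
print(result)  # LABEL_X
```

In this sketch, each intermediate question uses only the single top answer. Figure 4 reports how P@1 changes when the top-$k$ answers for the intermediate questions are considered instead.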