End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering

Devendra Singh Sachan, Siva Reddy, William Hamilton, Chris Dyer, Dani Yogatama

TL;DR

Emdr$^2$ addresses end-to-end training for retrieval-augmented open-domain QA by modeling retrieved-document sets as latent variables and applying an EM-inspired objective that jointly updates a dual-encoder retriever and a Fusion-in-Decoder reader. It uses two latent-variable estimates (the reader-based prior and a posterior-informed retriever signal) to guide learning, with a tractable approximation over the top-$K$ retrieved documents and a stop-gradient design to stabilize training. The approach achieves state-of-the-art performance on Natural Questions, TriviaQA, and WebQuestions with a base-size model and is robust to retriever initialization, including unsupervised pre-training (MSS). Beyond these results, the work provides a practical end-to-end framework for latent-variable training in retrieval-augmented generation and offers insights into initialization, ablations, and alternative objectives for such systems.

Abstract

We present an end-to-end differentiable training method for retrieval-augmented open-domain question answering systems that combine information from multiple retrieved documents when generating answers. We model retrieval decisions as latent variables over sets of relevant documents. Since marginalizing over sets of retrieved documents is computationally hard, we approximate this using an expectation-maximization algorithm. We iteratively estimate the value of our latent variable (the set of relevant documents for a given question) and then use this estimate to update the retriever and reader parameters. We hypothesize that such end-to-end training allows training signals to flow to the reader and then to the retriever better than staged-wise training. This results in a retriever that is able to select more relevant documents for a question and a reader that is trained on more accurate documents to generate an answer. Experiments on three benchmark datasets demonstrate that our proposed method outperforms all existing approaches of comparable size by 2-3% absolute exact match points, achieving new state-of-the-art results. Our results also demonstrate the feasibility of learning to retrieve to improve answer generation without explicit supervision of retrieval decisions.
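
The abstract gives no formula, so the following is only a minimal sketch of how a two-term objective of this kind could be computed for one question. It assumes a reader term (the log-likelihood of the answer given all top-$K$ documents jointly) and a retriever term (the log of the retriever-weighted sum of per-document answer likelihoods, with those likelihoods treated as constants, i.e. behind a stop-gradient). The function names `log_softmax` and `emdr2_style_loss` and all input values are hypothetical illustrations, not the authors' implementation.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def emdr2_style_loss(reader_set_logp, reader_doc_logps, retriever_scores):
    """Sketch of a two-term loss for a single question (assumed form).

    reader_set_logp:  log p(answer | question, all top-K docs), from the reader.
    reader_doc_logps: log p(answer | question, doc_k) for each doc; in a real
                      autodiff setup these would sit behind a stop-gradient.
    retriever_scores: unnormalized retriever scores for the same K docs.
    """
    # Retriever distribution restricted to the top-K documents.
    log_prior = log_softmax(retriever_scores)

    # log sum_k p(answer | doc_k) * p(doc_k | question), via log-sum-exp.
    joint = [lp + ll for lp, ll in zip(log_prior, reader_doc_logps)]
    m = max(joint)
    retriever_term = m + math.log(sum(math.exp(j - m) for j in joint))

    reader_loss = -reader_set_logp
    retriever_loss = -retriever_term
    return reader_loss + retriever_loss, reader_loss, retriever_loss


# Hypothetical numbers: the first document explains the answer much better.
doc_logps = [math.log(0.9), math.log(0.1)]
_, _, good = emdr2_style_loss(math.log(0.8), doc_logps, [2.0, 0.0])
_, _, bad = emdr2_style_loss(math.log(0.8), doc_logps, [0.0, 2.0])
print(good < bad)  # retriever term is lower when the helpful doc scores higher
```

Under these assumptions, the retriever term decreases when the retriever puts more probability on documents for which the reader assigns the answer a high likelihood, which is one way training signal can reach the retriever without labeled retrieval decisions.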

Paper Structure

This paper contains 43 sections, 10 equations, 4 figures, 10 tables, 1 algorithm.

Figures (4)

  • Figure 1: An illustration of the different components of Emdr$^2$. Colored blocks indicate components which contain trainable parameters.
  • Figure 2: Performance on NQ, TriviaQA, and WebQ as we vary the number of retrieved documents.
  • Figure 3: Reader and retriever training losses when the model is initialized with MSS pre-training.
  • Figure 4: Reader training loss vs steps for NQ, TriviaQA, and WebQ when the retriever is either initialized by MSS pre-training or by MSS followed by supervised DPR training (MSS + DPR).