LightPAL: Lightweight Passage Retrieval for Open Domain Multi-Document Summarization

Masafumi Enomoto, Kunihiro Takeoka, Kosuke Akimoto, Kiril Gashteovski, Masafumi Oyamada

TL;DR

LightPAL is a lightweight passage retrieval method for ODMDS. It uses an LLM to pre-construct a graph representing passage relationships, then runs a random walk over that graph at retrieval time, avoiding iterative LLM inference.

Abstract

Open-Domain Multi-Document Summarization (ODMDS) is the task of generating summaries from large document collections in response to user queries. This task is crucial for efficiently addressing diverse information needs from users. Traditional retrieve-then-summarize approaches fall short for open-ended queries in ODMDS tasks. These queries often require broader context than the initially retrieved passages provide, making it challenging to retrieve all relevant information in a single search. While iterative retrieval methods have been explored for multi-hop question answering (MQA), they are impractical for ODMDS due to the high latency of repeated LLM inference. Accordingly, we propose LightPAL, a lightweight passage retrieval method for ODMDS. LightPAL leverages an LLM to pre-construct a graph representing passage relationships, then employs a random walk during retrieval, avoiding iterative LLM inference. Experiments demonstrate that LightPAL outperforms naive sparse and pre-trained dense retrievers in both retrieval and summarization metrics, while achieving higher efficiency than iterative MQA approaches.

Paper Structure

This paper contains 16 sections, 1 equation, 7 figures, and 6 tables.

Figures (7)

  • Figure 1: LightPAL, a lightweight retrieval method for Open-domain Multi-Document Summarization (ODMDS). It consists of two main processes: 1) Offline Graph Construction: evaluates passage relevance using LLM conditional generation probabilities to create edges between passages. 2) Online Retrieval: uses an off-the-shelf retriever to obtain initial passages, then performs a random walk on the constructed graph to retrieve informative context passages that are referenced by many others. This approach retrieves context passages efficiently, without iterative LLM inference at runtime.
  • Figure 2: Summary quality plot for samples with improved retrieval by LightPAL (Improved Set) versus those without (Non-improved Set). Each group represents a combination of dataset and base retriever, with the percentage of samples showing improved retrieval performance noted below.
  • Figure 3: Case study of retrieved passages using LightPAL and a naive base retriever.
  • Figure 4: Annotation instruction
  • Figure 5: Annotation interface
  • ...and 2 more figures
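
The online retrieval step described in the Figure 1 caption can be illustrated with a small sketch. This is not the authors' implementation: the page does not give the paper's exact random-walk formulation, so the code below assumes a random walk with restart (personalized PageRank computed by power iteration). The function name `random_walk_context`, the parameters `restart` and `iters`, and the edge weights are all hypothetical; the weights are placeholder numbers standing in for relevance scores that the offline step would derive from LLM conditional generation probabilities.

```python
def random_walk_context(edges, seeds, k, restart=0.15, iters=50):
    """Rank non-seed passages by random walk with restart from the seeds.

    edges: dict mapping passage id -> {neighbor id: non-negative weight}
           (hypothetical stand-in for the pre-constructed passage graph)
    seeds: passage ids returned by the base retriever
    Returns (top-k context passage ids, score per passage).
    """
    seeds = set(seeds)
    nodes = set(edges) | {v for nbrs in edges.values() for v in nbrs} | seeds
    # The walker restarts uniformly at one of the seed passages.
    restart_dist = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(restart_dist)
    for _ in range(iters):
        nxt = {n: restart * restart_dist[n] for n in nodes}
        for u in nodes:
            nbrs = edges.get(u, {})
            total = sum(nbrs.values())
            if total == 0:
                # Passage with no outgoing edges: return its mass to the seeds.
                for n in seeds:
                    nxt[n] += (1 - restart) * score[u] * restart_dist[n]
                continue
            for v, w in nbrs.items():
                nxt[v] += (1 - restart) * score[u] * w / total
        score = nxt
    # Context passages are the highest-scoring passages outside the seed set.
    candidates = sorted(nodes - seeds, key=lambda n: (-score[n], n))
    return candidates[:k], score


# Toy graph: p3 is referenced by both seed passages, p4 only via p3,
# and p5 is never reached from the seeds.
edges = {
    "p1": {"p3": 0.9},
    "p2": {"p3": 0.8},
    "p3": {"p4": 0.7},
    "p5": {"p1": 0.5},
}
context, score = random_walk_context(edges, seeds=["p1", "p2"], k=2)
```

In this toy graph, `p3` receives probability mass from both seed passages, which mirrors the caption's idea of favoring context passages referenced by many others. No LLM call happens inside the loop; only the pre-built graph is consulted, which is the source of the efficiency the abstract claims.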