Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata

Mykhailo Poliakov; Nadiya Shvai

Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata

Mykhailo Poliakov, Nadiya Shvai

TL;DR

This work introduces a new method called Multi-Meta-RAG, which uses database filtering with LLM-extracted metadata to improve the RAG selection of the relevant documents from various sources, relevant to the question.

Abstract

The retrieval-augmented generation (RAG) enables retrieval of relevant information from an external knowledge source and allows large language models (LLMs) to answer queries over previously unseen document collections. However, it was demonstrated that traditional RAG applications perform poorly in answering multi-hop questions, which require retrieving and reasoning over multiple elements of supporting evidence. We introduce a new method called Multi-Meta-RAG, which uses database filtering with LLM-extracted metadata to improve the RAG selection of the relevant documents from various sources, relevant to the question. While database filtering is specific to a set of questions from a particular domain and format, we found out that Multi-Meta-RAG greatly improves the results on the MultiHop-RAG benchmark. The code is available at https://github.com/mxpoliakov/Multi-Meta-RAG.

Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata

TL;DR

Abstract

Paper Structure (12 sections, 2 figures, 4 tables)

This paper contains 12 sections, 2 figures, 4 tables.

Introduction
Related works
Multi-Meta-RAG
Extraction of Relevant Query Metadata with the LLM
Improved Chunk Selection using Metadata Filtering
Results
Chunk Retrieval Experiment
LLM Response Generation Experiment
Conclusion
Limitations
Future work
Acknowledgments.

Figures (2)

Figure 1: A naive RAG implementation for MultiHop-RAG queries. RAG selects chunks from articles not asked in the example query, which leads to LLM giving a wrong response.
Figure 2: Multi-Meta-RAG: an improved RAG with database filtering using metadata. Metadata is extracted via secondary LLM. With filtering, we can ensure top-K chunks are always from relevant sources with better chances of getting correct overall responses.

Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata

TL;DR

Abstract

Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata

Authors

TL;DR

Abstract

Table of Contents

Figures (2)