Robust Retrieval Augmented Generation for Zero-shot Slot Filling

Michael Glass, Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Alfio Gliozzo

TL;DR

This work tackles zero-shot slot filling by integrating dense passage retrieval with hard negative mining and end-to-end retrieval-augmented generation. The proposed Knowledge Graph Induction (KGI) framework combines a task-tailored DPR for evidence retrieval with a RAG-based generator, enhanced by Dense Negative Sampling to improve retrieval robustness. On KILT benchmarks (T-REx and zsRE), KGI_1 delivers substantial improvements over baselines and achieves top leaderboard performance, while also enabling zero-shot domain adaptation to TACRED and rapid few-shot gains. The authors provide extensive hyperparameter details, report model scales (620M parameters), and release code and pre-trained models to support reproducibility and further research in slot filling and related knowledge-intensive tasks.

Abstract

Automatically inducing high-quality knowledge graphs from a given collection of documents remains a challenging problem in AI. One way to make headway on this problem is through advances in a related task known as slot filling. In this task, given an entity query in the form of [Entity, Slot, ?], a system is asked to fill the slot by generating or extracting the missing value, using evidence drawn from relevant passage(s) in the given document collection. Recent work in the field tries to solve this task end-to-end using retrieval-based language models. In this paper, we present a novel approach to zero-shot slot filling that extends dense passage retrieval with hard negatives and robust training procedures for retrieval augmented generation models. Our model reports large improvements on both the T-REx and zsRE slot filling datasets, improving both passage retrieval and slot value generation, and ranks first on the KILT leaderboard. Moreover, we demonstrate the robustness of our system by showing its domain adaptation capability on a new variant of the TACRED dataset for slot filling, through a combination of zero/few-shot learning. We release the source code and pre-trained models.
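
As a rough illustration of the two ideas named in the abstract, the sketch below represents a slot filling query of the form [Entity, Slot, ?] and picks a hard negative for retriever training. It is a toy under stated assumptions, not the authors' implementation: the names `SlotQuery` and `mine_hard_negative`, the `[SEP]` separator, and the hand-written two-dimensional vectors are all hypothetical, and a hard negative is assumed here to be the highest-scoring retrieved passage that is not among the gold evidence passages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotQuery:
    """A query of the form [Entity, Slot, ?]; the missing value is what the system must fill."""
    entity: str
    slot: str

    def as_text(self, sep: str = " [SEP] ") -> str:
        # The separator is an assumption made for illustration only.
        return f"{self.entity}{sep}{self.slot}"


def dot(u, v):
    """Inner product, the similarity score commonly used in dense retrieval."""
    return sum(a * b for a, b in zip(u, v))


def mine_hard_negative(query_vec, passages, gold_ids):
    """Return the id of the best-scoring passage that is NOT gold evidence.

    `passages` is a list of dicts with keys "id" and "vec" (toy embeddings).
    Returns None when every passage is gold.
    """
    ranked = sorted(passages, key=lambda p: dot(query_vec, p["vec"]), reverse=True)
    for passage in ranked:
        if passage["id"] not in gold_ids:
            return passage["id"]
    return None


# Toy data: embeddings are hand-written stand-ins for encoder outputs.
query = SlotQuery(entity="Albert Einstein", slot="place of birth")
query_vec = [1.0, 0.0]
passages = [
    {"id": "p1", "vec": [0.9, 0.1]},  # gold evidence passage
    {"id": "p2", "vec": [0.8, 0.0]},  # similar to the query but not gold
    {"id": "p3", "vec": [0.1, 0.9]},  # unrelated passage
]
hard_negative = mine_hard_negative(query_vec, passages, gold_ids={"p1"})
```

With these toy vectors, `p2` scores just below the gold passage `p1`, so it is the kind of confusable passage a retriever benefits from seeing as a negative during training, whereas `p3` would be an easy negative.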

Paper Structure

This paper contains 21 sections, 1 equation, 5 figures, 9 tables.

Figures (5)

  • Figure 1: Slot Filling task
  • Figure 2: KGI Architecture
  • Figure 3: DPR Training
  • Figure 4: RAG Architecture
  • Figure 5: Performance as a function of entity frequency