A Study on Building Efficient Zero-Shot Relation Extraction Models

Hugo Thomas, Caio Corro, Guillaume Gravier, Pascale Sébillot

TL;DR

This work introduces a typology of existing zero-shot relation extraction models and proposes several strategies for building single-pass models and models with a rejection mechanism. It shows that no existing model is truly robust under realistic assumptions, although AlignRE performs best overall across all criteria.

Abstract

Zero-shot relation extraction aims to identify relations between entity mentions using textual descriptions of novel (i.e., previously unseen) types instead of labeled training examples. Previous works often rely on unrealistic assumptions: (1) pairs of mentions are often encoded directly in the input, which prevents offline pre-computation for large-scale document database querying; (2) no rejection mechanism is introduced, which biases the evaluation when these models are used in a retrieval scenario where some (and often most) inputs are irrelevant and must be ignored. In this work, we study the robustness of existing zero-shot relation extraction models when adapting them to a realistic extraction scenario. To this end, we introduce a typology of existing models and propose several strategies to build single-pass models and models with a rejection mechanism. We adapt several state-of-the-art tools and compare them in this challenging setting, showing that no existing work is really robust to realistic assumptions, but that overall AlignRE (Li et al., 2024) performs best along all criteria.
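
To make the two assumptions above concrete, the following is a minimal sketch, not the method of the paper or of AlignRE. It assumes that mention-pair representations have already been computed offline and stored, and that rejection is done by thresholding a cosine similarity against the relation description. The toy vectors, the threshold value of 0.8, and the names `cosine` and `classify_with_rejection` are all hypothetical and chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def classify_with_rejection(pair_vec, relation_vecs, threshold):
    """Return the best-matching relation type, or None to reject the input."""
    best_name, best_score = None, -1.0
    for name, vec in relation_vecs.items():
        score = cosine(pair_vec, vec)
        if score > best_score:
            best_name, best_score = name, score
    # Rejection: no targeted type is similar enough to the candidate pair.
    return best_name if best_score >= threshold else None

# Hypothetical encoding of the description of the only targeted relation type.
relation_vecs = {"operating system": [1.0, 0.0, 0.2]}

# Hypothetical mention-pair vectors, assumed to be pre-computed offline,
# i.e. without knowing the targeted relation types at encoding time.
precomputed_pairs = {
    "pair_1": [0.9, 0.1, 0.3],  # close to the relation description
    "pair_2": [0.0, 1.0, 0.0],  # unrelated, should be rejected
}

predictions = {
    pair_id: classify_with_rejection(vec, relation_vecs, threshold=0.8)
    for pair_id, vec in precomputed_pairs.items()
}
print(predictions)
```

Because the stored pair vectors do not depend on the relation descriptions in this sketch, a new relation type only requires encoding its description and re-running the comparison. In practice the threshold would have to be tuned, and other rejection strategies are possible.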

Paper Structure (36 sections, 7 equations, 2 figures, 5 tables)

Figures (2)

  • Figure 1: In relation extraction, the input is an utterance with two identified mentions, a head mention and a tail mention. We assume that the only targeted relation type is operating system. (left) Two input examples. In (1), the model must predict that there is a relation of this type between the two mentions. In (2), the model should reject the input, as the candidate relation does not correspond to any of the targeted types. (right) Example of side-information used for zero-shot relation extraction.
  • Figure 2: Generic illustration of an encoder-only zero-shot RE model.

Theorems & Definitions (2)

  • Definition 1: On-the-fly zero-shot classification
  • Definition 2: Offline encoding