$\textit{BenchIE}^{FL}$ : A Manually Re-Annotated Fact-Based Open Information Extraction Benchmark

Fabrice Lamarche, Philippe Langlais

TL;DR

$\textit{BenchIE}^{FL}$ addresses reliability gaps in open information extraction (OIE) benchmarks by manually re-annotating BenchIE under inferential, minimal, and exhaustivity-aware guidelines. The work introduces a flexible matching framework (AF, LoD, Punc), a 300-sentence $\textit{BenchIE}^{FL}$ corpus, and new annotation and matching guidelines. Empirical results show that $\textit{BenchIE}^{FL}$ correlates more strongly with downstream tasks (ABQA, C-QA, KBP) than prior benchmarks do. Older, rule-based extractors remain competitive, while neural models tend to produce longer, noisier extractions. The work provides practical resources for fairer OIE evaluation and highlights the importance of task-aligned benchmarks for guiding extractor development and selection.

Abstract

Open Information Extraction (OIE) is a field of natural language processing that aims to present textual information in a format that allows it to be organized, analyzed and reflected upon. Numerous OIE systems have been developed, each claiming ever-increasing performance, which creates a need for objective benchmarks. BenchIE is the latest reference we know of. Although it is very well thought out, we noticed a number of issues that we believe are limiting. Therefore, we propose $\textit{BenchIE}^{FL}$, a new OIE benchmark which fully enforces the principles of BenchIE while containing fewer errors, omissions and shortcomings when candidate facts are matched against reference ones. $\textit{BenchIE}^{FL}$ allows insightful conclusions to be drawn on the actual performance of OIE extractors.

Paper Structure (72 sections, 2 figures, 45 tables)

Figures (2)

  • Figure 1: Downstream tasks flowchart.
  • Figure 2: System performance by benchmark, scored using each benchmark's default scoring function.