Assessing the quality of information extraction

Filip Seitl, Tomáš Kovářík, Soheyla Mirshahi, Jan Kryštůfek, Rastislav Dujava, Matúš Ondreička, Herbert Ullrich, Petr Gronat

TL;DR

The paper addresses how to evaluate information extraction (IE) quality when labeled data is scarce. It introduces a synthetic ground truth built from "needles" infused into the source data, together with the MINEA score. It proposes a schema-driven approach to structuring extracted information and analyzes how length constraints and the lost-in-the-middle phenomenon affect IE, handling long inputs through iterative, piecewise extraction. The resulting framework is automatic and domain-adaptable, allows objective quality assessment without manual labeling, and is used to compare LLMs on a healthcare business corpus. The work also offers practical guidance on iteration strategies and model selection.

Abstract

Advances in large language models have notably enhanced the efficiency of information extraction from unstructured and semi-structured data sources. As these technologies become integral to various applications, establishing an objective measure of information extraction quality becomes imperative. However, the scarcity of labeled data presents significant challenges to this endeavor. In this paper, we introduce an automatic framework to assess the quality and completeness of information extraction/retrieval. The framework focuses on information extracted in the form of entities and their properties. We discuss how to handle the input/output size limitations of large language models and analyze their performance when extracting information. In particular, we introduce scores to evaluate the quality of the extraction and provide an extensive discussion on how to interpret them.
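
To make the needle idea concrete, here is a minimal sketch of how synthetic ground truth might be infused into a document and then checked against the extracted entities. It is an illustration under stated assumptions, not the paper's method: the function names `infuse_needles` and `needle_recall`, the needle fields `name` and `text`, and the exact-match rule on entity names are all hypothetical, and the score computed here is a plain recovered-needle fraction rather than the paper's MINEA definition, which this summary does not reproduce.

```python
import random


def infuse_needles(paragraphs, needles, seed=0):
    """Return a copy of `paragraphs` with each needle text inserted
    at a randomly chosen paragraph boundary (hypothetical helper)."""
    rng = random.Random(seed)
    doc = list(paragraphs)
    for needle in needles:
        position = rng.randint(0, len(doc))  # inclusive upper bound
        doc.insert(position, needle["text"])
    return doc


def needle_recall(needles, extracted):
    """Fraction of needles whose name appears among the extracted
    entity names. Assumption: case-insensitive exact match on 'name'."""
    if not needles:
        return 0.0
    extracted_names = {e.get("name", "").strip().lower() for e in extracted}
    found = sum(
        1 for n in needles if n["name"].strip().lower() in extracted_names
    )
    return found / len(needles)


# Toy data: two needles are infused, the extractor recovers one of them.
paragraphs = ["First paragraph.", "Second paragraph.", "Third paragraph."]
needles = [
    {"name": "Acme Clinic", "text": "Acme Clinic opened a new branch."},
    {"name": "Dr. Example", "text": "Dr. Example joined the board."},
]
infused = infuse_needles(paragraphs, needles)
extracted = [{"name": "acme clinic"}, {"name": "Unrelated Org"}]
score = needle_recall(needles, extracted)
```

Because the needles are generated rather than annotated, the expected entities are known in advance, which is what lets a score of this kind be computed without manual labeling. A real matching rule would likely need to be more tolerant than exact name equality.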

Paper Structure (15 sections, 1 equation, 8 figures, 6 tables)

Figures (8)

  • Figure 1: Toy example: structured information encapsulating three entities using schema.org.
  • Figure 2: Toy example: two needles, highlighted in blue, accompanied by additional information described by 'name', 'description', and 'keywords'.
  • Figure 3: Toy example: information extracted from the data infused with the needles from Figure 2.
  • Figure 4: Prompt to determine a possible suitable schema from a given text -- Wikipedia article about IE.
  • Figure 5: Schema.org types found by an LLM within Wikipedia article about IE.
  • ...and 3 more figures