A Dataset for Named Entity Recognition and Relation Extraction from Art-historical Image Descriptions

Stefanie Schneider, Miriam Göldl, Julian Stalter, Ricarda Vollmer

TL;DR

FRAME is a manually annotated dataset of art-historical image descriptions for Named Entity Recognition (NER) and Relation Extraction (RE). It can be used to benchmark and fine-tune NER and RE systems, including zero- and few-shot setups with Large Language Models (LLMs).

Abstract

This paper introduces FRAME (Fine-grained Recognition of Art-historical Metadata and Entities), a manually annotated dataset of art-historical image descriptions for Named Entity Recognition (NER) and Relation Extraction (RE). Descriptions were collected from museum catalogs, auction listings, open-access platforms, and scholarly databases, then filtered to ensure that each text focuses on a single artwork and contains explicit statements about its material, composition, or iconography. FRAME provides stand-off annotations in three layers: a metadata layer for object-level properties, a content layer for depicted subjects and motifs, and a co-reference layer linking repeated mentions. Across layers, entity spans are labeled with 37 types and connected by typed RE links between mentions. Entity types are aligned with Wikidata to support Named Entity Linking (NEL) and downstream knowledge-graph construction. The dataset is released as UIMA XMI Common Analysis Structure (CAS) files with accompanying images and bibliographic metadata, and can be used to benchmark and fine-tune NER and RE systems, including zero- and few-shot setups with Large Language Models (LLMs).
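
The layered stand-off scheme described in the abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class names, the example sentence, the entity labels ("Material", "Person", "Object"), and the relation label ("holds") are all hypothetical and are not taken from FRAME's actual type system or its UIMA XMI files. The point it shows is that the text stays untouched while annotations refer to it by character offsets.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Span:
    begin: int   # character offset where the mention starts
    end: int     # character offset just past the mention
    layer: str   # "metadata" or "content" (the co-reference layer is kept as chains)
    label: str   # entity type; the labels used below are hypothetical


@dataclass(frozen=True)
class Relation:
    source: Span
    target: Span
    label: str   # hypothetical relation type


@dataclass
class Record:
    text: str
    spans: list = field(default_factory=list)
    relations: list = field(default_factory=list)
    coref_chains: list = field(default_factory=list)  # each chain is a list of Spans

    def add_span(self, phrase: str, layer: str, label: str) -> Span:
        # Locate the first occurrence of the phrase and store offsets only.
        begin = self.text.index(phrase)
        span = Span(begin, begin + len(phrase), layer, label)
        self.spans.append(span)
        return span

    def surface(self, span: Span) -> str:
        # Recover the mention text from the offsets.
        return self.text[span.begin:span.end]


# Invented example description, not a FRAME record.
record = Record("Oil on canvas. A woman holds a lute; she looks at the viewer.")
material = record.add_span("Oil on canvas", "metadata", "Material")
woman = record.add_span("woman", "content", "Person")
lute = record.add_span("lute", "content", "Object")
she = record.add_span("she", "content", "Person")

record.relations.append(Relation(woman, lute, "holds"))
record.coref_chains.append([woman, she])

for rel in record.relations:
    print(record.surface(rel.source), rel.label, record.surface(rel.target))
```

Because the annotations only hold offsets, several layers can coexist over the same text without altering it, which is the property that stand-off formats such as UIMA CAS rely on.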

Paper Structure (17 sections, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Each record in the dataset includes (1) the referenced artwork image, (2) basic artwork metadata, and (3) an art-historical text excerpt labeled with NER and RE annotations.
  • Figure 2: Our dataset creation process involves a modular, multi-stage pipeline, integrating both manual and automated components.
  • Figure 3: Four-step annotation procedure. (a) First, the text is read in full to obtain an overview, without creating annotations. (b) Second, expressions belonging to the metadata layer are annotated. (c) Third, expressions in the content layer are annotated. (d) Fourth and finally, co-references are annotated.