EUFCC-CIR: a Composed Image Retrieval Dataset for GLAM Collections

Francesc Net, Lluis Gomez

TL;DR

The value of the EUFCC-CIR dataset is demonstrated by highlighting its unique qualities in comparison to other existing CIR datasets and evaluating the performance of several zero-shot CIR baselines.

Abstract

The intersection of Artificial Intelligence and Digital Humanities enables researchers to explore cultural heritage collections with greater depth and scale. In this paper, we present EUFCC-CIR, a dataset designed for Composed Image Retrieval (CIR) within Galleries, Libraries, Archives, and Museums (GLAM) collections. Our dataset is built on top of the EUFCC-340K image labeling dataset and contains over 180K annotated CIR triplets. Each triplet is composed of a multi-modal query (an input image plus a short text describing the desired attribute manipulations) and a set of relevant target images. The EUFCC-CIR dataset fills an existing gap in CIR-specific resources for Digital Humanities. We demonstrate the value of the EUFCC-CIR dataset by highlighting its unique qualities in comparison to other existing CIR datasets and evaluating the performance of several zero-shot CIR baselines.
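
The triplet structure described above can be sketched as a small record type. This is only an illustration: the class name `CIRTriplet`, its field names, and the file names are hypothetical and are not taken from the EUFCC-CIR release; the modification text reuses the example from Figure 1.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CIRTriplet:
    """One annotated CIR sample (hypothetical layout, not the official schema)."""
    query_image: str                # identifier or path of the input image
    modification_text: str          # short text describing the attribute change
    target_images: Tuple[str, ...]  # all images considered relevant targets

    def is_relevant(self, image_id: str) -> bool:
        # A retrieved image counts as correct if it is any of the targets.
        return image_id in self.target_images


# Hypothetical example; the file names are placeholders.
triplet = CIRTriplet(
    query_image="silver_coin.jpg",
    modification_text="Change silver for copper",
    target_images=("copper_coin_a.jpg", "copper_coin_b.jpg"),
)
```

Storing the targets as a set-like collection rather than a single image reflects the abstract's statement that each query has a set of relevant target images.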

Paper Structure

This paper contains 7 sections, 9 figures, and 2 tables.

Figures (9)

  • Figure 1: Composed Image Retrieval (CIR) example. The user query is expressed with two modalities: an image of a silver coin and a short text ("Change silver for copper") that describes the desired modifications. These inputs are processed by the CIR model, which searches in a dataset to generate a ranking of predictions according to the visual and textual modalities.
  • Figure 2: Sample triplets from different CIR datasets.
  • Figure 3: Illustrative examples of how the EUFCC-CIR dataset triplets are created by analyzing the elements of the annotation hierarchy of each image attribute.
  • Figure 4: Total frequency of attribute labels for the different partitions of the EUFCC-CIR dataset. Zoom in for better visualization.
  • Figure 5: Frequency of source-target attribute label pairs for the different partitions of the EUFCC-CIR dataset. The source_element is represented in the inner circle and the target_element in the outer circle. Zoom in for better visualization.
  • ...and 4 more figures
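
The retrieval flow in the Figure 1 caption (image plus text in, ranked gallery out) can be sketched with toy vectors. This is a minimal sketch under stated assumptions, not one of the baselines evaluated in the paper: the fusion rule (sum of normalized image and text embeddings), the function names, and the 4-dimensional hand-made embeddings are all hypothetical. A real system would obtain embeddings from a pretrained vision-language encoder.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def compose_query(image_emb, text_emb):
    """Assumed fusion rule: sum the two unit vectors, then renormalize."""
    fused = [a + b for a, b in zip(normalize(image_emb), normalize(text_emb))]
    return normalize(fused)


def rank_gallery(query_emb, gallery):
    """Return gallery ids sorted by cosine similarity to the query, best first."""
    scores = {
        image_id: sum(q * g for q, g in zip(query_emb, normalize(emb)))
        for image_id, emb in gallery.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy embedding axes (hypothetical): [coin, silver, copper, chair]
silver_coin_image = [1.0, 1.0, 0.0, 0.0]   # query image
copper_text = [0.0, 0.0, 1.0, 0.0]         # stands in for "Change silver for copper"

gallery = {
    "silver_coin": [1.0, 1.0, 0.0, 0.0],
    "copper_coin": [1.0, 0.0, 1.0, 0.0],
    "wooden_chair": [0.0, 0.0, 0.0, 1.0],
}

query = compose_query(silver_coin_image, copper_text)
ranking = rank_gallery(query, gallery)
print(ranking)  # ['copper_coin', 'silver_coin', 'wooden_chair']
```

In this toy setup the copper coin ranks first because it matches both the object shown in the query image and the attribute named in the text, while the original silver coin still scores above an unrelated object.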