Labeling of Cultural Heritage Collections on the Intersection of Visual Analytics and Digital Humanities

Christofer Meinecke

TL;DR

This paper examines the challenges of applying visual analytics and machine learning to cultural heritage collections, focusing on labeling, data quality, and the underexplored area of intangible heritage. It argues that successful interdisciplinary work relies on participatory design and human-in-the-loop labeling, and it presents three case studies (Interactive Text Edition Alignment, Visualizing Entities in Medieval Manuscripts, and Hierarchical Classification for Medieval Illuminations) to illustrate data problems and concrete design takeaways. The findings highlight limited data, missing ground truth, and vocabulary drift across institutions, as well as the need for lightweight, transparent labeling workflows and domain-specific vocabularies, sometimes supported by weak supervision. Together, the work offers practical guidance for visualization scholars and outlines directions for strengthening GLAM collaborations at the intersection of digital humanities and visual analytics, including multi-label approaches and strategies for incorporating intangible heritage perspectives.

Abstract

Engaging in interdisciplinary projects at the intersection of visualization and humanities research can be challenging. The challenges include finding outcomes that are valuable for both domains and deciding how to apply state-of-the-art visual analytics methods such as supervised machine learning algorithms. We discuss these challenges in the context of cultural heritage data and note a gap in applying these methods to intangible heritage. To reflect on several interdisciplinary projects, we present three case studies that focus on the labeling of cultural heritage collections, the problems and challenges with the data, the participatory design process, and the takeaways for visualization scholars from these collaborations.

Paper Structure (22 sections, 4 figures)

This paper contains 22 sections, 4 figures.

Introduction
Challenges with Cultural Heritage Data
Metadata & Annotations
Intangible Heritage
Valuable Outcome for both Communities
Case Studies
Interactive Text Edition Alignment
Setup
Data Problems
Design Process
Takeaways
Visualizing Entities in Medieval Manuscripts
Setup
Data Problems
Design Process
...and 7 more sections

Figures (4)

Figure 1: Challenges when working with Cultural Heritage data in interdisciplinary projects.
Figure 2: The human-in-the-loop process for "Explaining Semi-Supervised Text Alignment through Visualization" [meinecke2021explaining].
Figure 3: A page of the Marco Polo dataset with entities found by a neural network.
Figure 4: Systematic overview of the semi-automated image labeling workflow applied to the Paris Bible datasets [parisbible].
