Let Guidelines Guide You: A Prescriptive Guideline-Centered Data Annotation Methodology

Federico Ruggeri, Eleonora Misino, Arianna Muti, Katerina Korre, Paolo Torroni, Alberto Barrón-Cedeño

TL;DR

GCAM addresses key flaws of the standard annotation methodology (SAM) by tying each data sample to explicit guideline fragments rather than to fixed class labels, which makes guideline adherence transparent and lets annotated data be reused across tasks. It formalizes a two-stage process: annotators map each sample to a subset of guidelines, and a separate class grounding function translates those guidelines into classes, which supports multiple tasks and richer model evaluation. A human annotation study and ML experiments show that GCAM achieves annotation quality comparable to SAM while offering deeper insight into guideline adherence and model alignment; encoder-based models generally perform well, whereas LLMs struggle to ground samples in the appropriate guidelines. The approach promises improved data quality and error analysis, and data and code are released to support reproducibility and broader adoption across annotation paradigms.

Abstract

We introduce the Guideline-Centered Annotation Methodology (GCAM), a novel data annotation methodology designed to report the annotation guidelines associated with each data sample. Our approach addresses three key limitations of the standard prescriptive annotation methodology by reducing the information loss during annotation and ensuring adherence to guidelines. Furthermore, GCAM enables the efficient reuse of annotated data across multiple tasks. We evaluate GCAM in two ways: (i) through a human annotation study and (ii) an experimental evaluation with several machine learning models. Our results highlight the advantages of GCAM from multiple perspectives, demonstrating its potential to improve annotation quality and error analysis.

Paper Structure (40 sections, 2 equations, 2 figures, 5 tables)


Figures (2)

  • Figure 1: (a) Example of guideline set $\mathcal{G}$ and class set $\mathcal{C}$ for hate speech detection, taken from Kirk et al. (2023). (b) In SAM, the annotator knows the mapping between $\mathcal{G}$ and $\mathcal{C}$ and maps the data sample $x$ to the class subset $\mathcal{C}_x$ (blue box). (c) In GCAM, the annotator only knows $\mathcal{G}$ and maps $x$ to the guideline subset $\mathcal{G}_x$ (blue box). Then, $x$ is mapped to $\mathcal{C}_x$ via the class grounding function $r$ relating $\mathcal{G}_x$ and $\mathcal{C}_x$ (green box). (d) Different class sets for the same $\mathcal{G}$. (e) In SAM, changing the class set requires a new annotation, while (f) GCAM allows annotating with different class sets via their corresponding class grounding functions (green boxes) at the cost of a single human annotation stage (blue box).
  • Figure 2: (a, b) Confusion matrices on $\mathcal{C}_x$ for SAM and GCAM. Neg. and Pos. indicate the negative and positive classes. (c) Confusion matrix on $\mathcal{G}_x$ for GCAM. (d) Grounding error types matrix for GCAM. ✗ and ✓ stand for wrong and correct predictions. Color coding refers to Edge, Ideal, and Confounder cases.
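
The two-stage mapping described in Figure 1 (c) and (f) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the guideline identifiers (`g1`, `g2`, `g3`), the class names, the `ground` helper, and the fallback class for an empty guideline subset are all hypothetical choices made for illustration. The sketch only assumes that the human annotation stage yields a guideline subset $\mathcal{G}_x$ for each sample, and that a class grounding function $r$ then derives $\mathcal{C}_x$ from it.

```python
from typing import Dict, Iterable, Set

# Hypothetical class grounding functions r over the same guideline set G.
# Each maps a guideline identifier to a class; all names are illustrative.
BINARY_GROUNDING: Dict[str, str] = {
    "g1": "positive",
    "g2": "positive",
    "g3": "positive",
}
FINE_GRAINED_GROUNDING: Dict[str, str] = {
    "g1": "category_A",
    "g2": "category_A",
    "g3": "category_B",
}


def ground(
    guideline_subset: Iterable[str],
    grounding: Dict[str, str],
    default_class: str = "negative",
) -> Set[str]:
    """Map a guideline subset G_x to a class subset C_x.

    Assumption: a sample with no matching guideline receives
    `default_class`. This fallback is an illustrative choice.
    """
    guideline_subset = set(guideline_subset)
    unknown = guideline_subset - set(grounding)
    if unknown:
        raise KeyError(f"guidelines without a grounding: {sorted(unknown)}")
    classes = {grounding[g] for g in guideline_subset}
    return classes or {default_class}


# Single human annotation stage: samples mapped to guideline subsets only.
annotations: Dict[str, Set[str]] = {
    "x1": {"g1", "g3"},
    "x2": set(),
}

# The same annotations are reused with two different class sets.
binary_labels = {x: ground(gx, BINARY_GROUNDING) for x, gx in annotations.items()}
fine_labels = {x: ground(gx, FINE_GRAINED_GROUNDING) for x, gx in annotations.items()}
```

In this sketch, changing the class set only means supplying a different grounding dictionary; the stored guideline subsets are left untouched, which mirrors the reuse argument made in panel (f).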