EEVEE: An Easy Annotation Tool for Natural Language Processing

Axel Sorensen, Siyao Peng, Barbara Plank, Rob van der Goot

TL;DR

Eevee addresses the setup burden of existing annotation tools by offering an easy-to-use, browser-based tool that runs offline and works on simple tab-separated data. It supports four task types (sequence labeling, span labeling, text classification, and seq2seq), with tasks configured via JSON, so multiple tasks can be annotated on one dataset without any installation. A usability study with two annotators yields System Usability Scale (SUS) scores of 75.0 and 87.5, averaging 81.25, which indicates high usability and a willingness to reuse the tool. The paper also demonstrates compatibility with HuggingFace Datasets and MaChAmp, and discusses the privacy advantages of handling data locally.
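
As a quick check of the arithmetic above, the following minimal sketch recomputes the reported average and shows how a single SUS score is conventionally derived from a ten-item questionnaire. The function name `sus_score` and the example response vector are hypothetical illustrations; only the two scores 75.0 and 87.5 come from the paper, and the scoring rule is the standard SUS one, not something specific to Eevee.

```python
from statistics import mean


def sus_score(responses):
    """Score one SUS questionnaire (ten answers on a 1-5 scale).

    Standard SUS rule: odd-numbered items contribute (answer - 1),
    even-numbered items contribute (5 - answer); the sum is scaled by 2.5
    so the result falls between 0 and 100.
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 responses")
    total = 0
    for item, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("each response must be between 1 and 5")
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


# The two per-annotator scores reported in the TL;DR.
reported_scores = [75.0, 87.5]
average = mean(reported_scores)

# Hypothetical all-neutral questionnaire, used only to exercise the function.
neutral = sus_score([3] * 10)
```

With two annotators the average is simply the midpoint of the two scores, so it should be read as an indication rather than a statistically robust estimate.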

Abstract

Annotation tools are the starting point for creating Natural Language Processing (NLP) datasets. A wide variety of tools is available; setting them up, however, is a hindrance. We propose EEVEE, an annotation tool focused on simplicity, efficiency, and ease of use. It can run directly in the browser (no setup required) and uses tab-separated files (as opposed to character offsets or task-specific formats) for annotation. It allows for annotation of multiple tasks on a single dataset and supports four task types: sequence labeling, span labeling, text classification, and seq2seq.

Paper Structure (5 sections, 3 figures)

Figures (3)

  • Figure 1: A screenshot of the setup page of Eevee with multiple tasks. The user currently configures the NER task.
  • Figure 2: Annotation example with the keyboard setting.
  • Figure 3: Searching for labels with a navigation bar.