MEGAnno+: A Human-LLM Collaborative Annotation System

Hannah Kim, Kushan Mitra, Rafael Li Chen, Sajjadur Rahman, Dan Zhang

TL;DR

The paper addresses the challenge of obtaining high-quality labeled data at scale by combining the efficiency of large language models with human verification. It introduces MEGAnno+, a human-LLM collaborative annotation system that integrates a reusable agent and prompt management framework, an LLM annotation pipeline, and an in-notebook verification widget for selectively validating labels. The main contributions are a novel data model with Agents, Jobs, and Verification; robust end-to-end LLM annotation with error handling and metadata capture; and an exploratory verification workflow that enables targeted human review. The approach aims to deliver reliable labeled data for domain-specific or privacy-sensitive tasks, while enabling reuse of configurations and facilitating comparisons across LLMs and prompts. The work highlights practical implications for deploying LLM annotators and discusses design choices, limitations, and avenues for future expansion.

Abstract

Large language models (LLMs) can label data faster and cheaper than humans for various NLP tasks. Despite their prowess, LLMs may fall short in understanding complex, sociocultural, or domain-specific context, potentially leading to incorrect annotations. Therefore, we advocate a collaborative approach where humans and LLMs work together to produce reliable and high-quality labels. We present MEGAnno+, a human-LLM collaborative annotation system that offers effective LLM agent and annotation management, convenient and robust LLM annotation, and exploratory verification of LLM labels by humans.

Paper Structure (31 sections, 5 figures)

Figures (5)

  • Figure 1: MEGAnno+ system architecture and LLM-integrated workflow. With the MEGAnno+ client, users can interact with the back-end service, which consists of web and database servers, through programmatic interfaces and UI widgets. The middle notebook shows our workflow, where cell [2] is LLM annotation and cell [3] is human verification.
  • Figure 2: UI for customizing a prompt template and previewing generated prompts. The prompt is generated based on the name and options of the label schema.
  • Figure 3: Example LLM responses and extraction results. Minor violations are processed as valid labels.
  • Figure 4: Annotation progress and summary.
  • Figure 5: The table view in the verification UI. Users can explore LLM annotations by filtering by label, sorting by confidence score, or searching the text input by keyword.
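
The captions for Figures 3 and 5 describe two steps that a small example may make more concrete: extracting a label from a free-form LLM response while tolerating minor format violations, and then exploring the results by label, keyword, and confidence. The Python sketch below is illustrative only and is not the MEGAnno+ API. The names `LABELS`, `Annotation`, `extract_label`, and `explore`, the single-word label options, and the sample confidence values are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Hypothetical label schema options (assumed to be single words).
LABELS = ("positive", "negative", "neutral")


@dataclass
class Annotation:
    text: str              # input text that was annotated
    raw_response: str      # unprocessed LLM output
    label: Optional[str]   # extracted label, or None if extraction failed
    confidence: float      # assumed to be supplied alongside the response


def extract_label(response: str, options: Sequence[str] = LABELS) -> Optional[str]:
    """Return the one schema option named in the response, else None.

    Case, punctuation, and surrounding words count as minor violations
    and are ignored; zero or several matching options are rejected.
    """
    tokens = re.sub(r"[^a-z\s]", " ", response.lower()).split()
    found = {opt for opt in options if opt in tokens}
    return found.pop() if len(found) == 1 else None


def explore(annotations: List[Annotation],
            label: Optional[str] = None,
            keyword: Optional[str] = None) -> List[Annotation]:
    """Filter by label and keyword, then sort least-confident first."""
    rows = [
        a for a in annotations
        if (label is None or a.label == label)
        and (keyword is None or keyword.lower() in a.text.lower())
    ]
    return sorted(rows, key=lambda a: a.confidence)


# Made-up (text, response, confidence) triples for illustration.
samples = [
    ("Great battery life", " Positive. ", 0.91),
    ("Battery died in a day", "Label: NEGATIVE", 0.62),
    ("It is a phone", "positive or negative", 0.40),
]
annotations = [Annotation(t, r, extract_label(r), c) for t, r, c in samples]
to_review = explore(annotations, keyword="battery")
```

Sorting in ascending order of confidence puts the least certain annotations first, which is one plausible way to prioritize human verification; a real deployment might sort or sample differently.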