LinguistAgent: A Reflective Multi-Model Platform for Automated Linguistic Annotation

Bingru Li

TL;DR

LinguistAgent addresses the labor-intensive challenge of linguistic annotation in the Humanities by offering a no-code, reflective multi-agent platform in which an Annotator and a Reviewer automate and benchmark metaphor identification. The platform supports Prompt Engineering, Retrieval-Augmented Generation, and Fine-tuning, and provides real-time token-level evaluation using precision ($P$), recall ($R$), and $F_1$. Case studies on metaphor identification demonstrate that the Reviewer loop improves detection, while a debug console and downloadable per-sample reports provide full traceability. The framework enables scalable, transparent, and reproducible annotation workflows for researchers, with open-source code available on GitHub.

Abstract

Data annotation remains a significant bottleneck in the Humanities and Social Sciences, particularly for complex semantic tasks such as metaphor identification. While Large Language Models (LLMs) show promise, a gap remains between their theoretical capability and their practical utility for researchers. This paper introduces LinguistAgent, an integrated, user-friendly platform that leverages a reflective multi-model architecture to automate linguistic annotation. The system implements a dual-agent workflow, comprising an Annotator and a Reviewer, to simulate a professional peer-review process. LinguistAgent supports comparative experiments across three paradigms: Prompt Engineering (Zero/Few-shot), Retrieval-Augmented Generation, and Fine-tuning. We demonstrate LinguistAgent's efficacy on the task of metaphor identification, providing real-time token-level evaluation (Precision, Recall, and $F_1$ score) against human gold standards. The application and code are released at https://github.com/Bingru-Li/LinguistAgent.
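
The token-level evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a binary label per token (1 = metaphorical, 0 = literal) and the standard definitions of precision, recall, and $F_1$; the function name `token_prf`, the label encoding, and the example sentence are hypothetical and are not taken from the LinguistAgent codebase.

```python
from typing import Sequence, Tuple


def token_prf(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float, float]:
    """Token-level precision, recall, and F1 for binary labels (1 = metaphorical).

    Hypothetical helper for illustration; not the platform's own implementation.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must label the same tokens")

    # Count true positives, false positives, and false negatives per token.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    # Guard against division by zero when nothing is predicted or nothing is gold.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Assumed example: "Time is a thief that steals youth"
    gold = [0, 0, 0, 1, 0, 1, 0]  # human gold standard: "thief", "steals"
    pred = [0, 0, 0, 1, 0, 0, 1]  # model output: "thief", "youth"
    p, r, f = token_prf(gold, pred)
    print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")  # P=0.50 R=0.50 F1=0.50
```

Here one token is correctly tagged, one is a false alarm, and one gold token is missed, so all three scores are 0.5. Averaging such per-sample scores over a dataset would give a figure comparable to the "Average F1 Score" named in the Figure 4 caption, although the paper's exact averaging scheme is not stated here.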

Paper Structure (12 sections, 5 figures, 1 table)

This paper contains 12 sections, 5 figures, 1 table.

Introduction
The Architecture of LinguistAgent
The Annotator Agent
The Reviewer Agent
Experimental Paradigms
Evaluation
Traceability and Error Analysis
Implementation and System Observability
Frontend Architecture: Streamlit Integration
Backend Logic and Model Heterogeneity
Case Study: Metaphor Identification
Conclusion

Figures (5)

Figure 1: The architecture of LinguistAgent.
Figure 2: The user interface of LinguistAgent.
Figure 3: The reasoning of the Annotator and the critique of the Reviewer.
Figure 4: The live tagging display, the performance (Average F1 Score), and the progress bar.
Figure 5: The debug section, keeping logs of all raw model responses and error types.
