LinguistAgent: A Reflective Multi-Model Platform for Automated Linguistic Annotation
Bingru Li
TL;DR
LinguistAgent addresses the labor-intensive challenge of linguistic annotation in the Humanities with a no-code platform built on a reflective multi-model architecture, in which an Annotator and a Reviewer automate and benchmark metaphor identification. The platform supports Prompt Engineering, Retrieval-Augmented Generation, and Fine-tuning, and provides real-time token-level evaluation with precision ($P$), recall ($R$), and $F_1$. Case studies on metaphor identification demonstrate that the Reviewer loop improves detection and that a debug console and downloadable per-sample reports make every annotation traceable. The framework enables scalable, transparent, and reproducible annotation workflows for researchers, with open-source code available at https://github.com/Bingru-Li/LinguistAgent.
Abstract
Data annotation remains a significant bottleneck in the Humanities and Social Sciences, particularly for complex semantic tasks such as metaphor identification. Although Large Language Models (LLMs) show promise, a substantial gap persists between their theoretical capability and their practical utility for researchers. This paper introduces LinguistAgent, an integrated, user-friendly platform that leverages a reflective multi-model architecture to automate linguistic annotation. The system implements a dual-agent workflow, comprising an Annotator and a Reviewer, to simulate a professional peer-review process. LinguistAgent supports comparative experiments across three paradigms: Prompt Engineering (Zero/Few-shot), Retrieval-Augmented Generation, and Fine-tuning. We demonstrate LinguistAgent's efficacy on the task of metaphor identification, providing real-time token-level evaluation (Precision, Recall, and $F_1$ score) against human gold standards. The application and code are available at https://github.com/Bingru-Li/LinguistAgent.
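To make the token-level evaluation concrete, the following is a minimal sketch of how Precision, Recall, and $F_1$ could be computed against a human gold standard. It assumes each token carries a binary label (1 for metaphorical, 0 for literal) and that gold and predicted sequences are aligned token by token. The function name `token_prf` and the label encoding are illustrative assumptions, not taken from the LinguistAgent codebase.

```python
from typing import Sequence, Tuple


def token_prf(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float, float]:
    """Token-level precision, recall, and F1 for binary labels.

    Assumption (hypothetical encoding): 1 = metaphorical token, 0 = literal token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")

    # Count true positives, false positives, and false negatives per token.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    # Guard against division by zero when nothing is predicted or nothing is gold.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Illustrative six-token sentence: two hits, one false alarm, one miss.
gold_labels = [0, 1, 0, 0, 1, 1]
pred_labels = [0, 1, 1, 0, 1, 0]
p, r, f1 = token_prf(gold_labels, pred_labels)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```

In this example the prediction has two true positives, one false positive, and one false negative, so all three scores equal 2/3. Comparing the same scores before and after a Reviewer pass would be one way to quantify the effect of the reflective loop.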
