Towards Automating Text Annotation: A Case Study on Semantic Proximity Annotation using GPT-4
Sachin Yadav, Tejaswi Choppa, Dominik Schlechtweg
TL;DR
This study investigates automating semantic proximity annotation with GPT-4 and GPT-3.5 by reusing human annotation guidelines and a small gold data subset to craft prompts, implemented within the PhiTag platform for online use via the OpenAI API. It compares customized prompts against automatic prompting strategies (guided by guidelines and tutorials) and evaluates each approach using a Use Pair semantic proximity task drawn from the DWUG dataset. The results show that carefully designed customized prompts can reach modest but reliable agreement with human gold data, while automatic prompting strategies perform notably worse, even with guideline refinements. The work highlights the substantial influence of prompt design and model configuration on annotation quality and points to future directions such as condensing guidelines and exploring more effective automatic prompting methods.
Abstract
This paper explores using GPT-3.5 and GPT-4 to automate the data annotation process with automatic prompting techniques. The main aim of this paper is to reuse human annotation guidelines along with some annotated data to design automatic prompts for LLMs, focusing on the semantic proximity annotation task. Automatic prompts are compared to customized prompts. We further implement the prompting strategies into an open-source text annotation tool, enabling easy online use via the OpenAI API. Our study reveals the crucial role of accurate prompt design and suggests that prompting GPT-4 with human-like instructions is not straightforwardly possible for the semantic proximity task. We show that small modifications to the human guidelines already improve the performance, suggesting possible ways for future research.
