BrainStorm @ iREL at #SMM4H 2024: Leveraging Translation and Topical Embeddings for Annotation Detection in Tweets

Manav Chaudhary, Harshit Gupta, Vasudeva Varma

TL;DR

The paper tackles the reliability of LLM-assisted annotation in health NLP by distinguishing human-made annotations from LLM-made ones for COVID-19 symptoms in Latin American Spanish tweets. It introduces a bilingual pipeline that translates the Spanish text to English, applies language-specific and domain-specific BERT representations, and augments these with BERTopic-derived topical embeddings to predict annotation provenance. Results show only marginal gains from topical embeddings and translation, which suggests limited effectiveness in this setup but leaves room for multilingual, domain-adaptive approaches with further refinement. The work highlights the need for more detailed ablations and richer features to improve the trustworthiness of annotations in healthcare NLP tasks.

Abstract

The proliferation of LLMs in various NLP tasks has sparked debates regarding their reliability, particularly in annotation tasks where biases and hallucinations may arise. In this shared task, we address the challenge of distinguishing annotations made by LLMs from those made by human domain experts in the context of COVID-19 symptom detection from tweets in Latin American Spanish. This paper presents BrainStorm @ iREL's approach to the SMM4H 2024 Shared Task. Leveraging the inherent topical information in tweets, we propose a novel approach to identify and classify annotations, aiming to enhance the trustworthiness of annotated data.

Paper Structure (7 sections, 1 figure, 1 table)

Figures (1)

  • Figure 1: Diagram illustrating our method. The process starts with data translation from Latin American Spanish to English. These two datasets are used to generate BERT embeddings, followed by topical embeddings using BERTopic. These two embeddings are combined to give a new feature-rich embedding to be used for training our models.
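The combination step described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the caption says only that the BERT and topical embeddings are "combined", so the concatenation and the L2 normalisation used here are assumptions, and the function names `l2_normalize`, `combine_embeddings`, and `build_features` are hypothetical. The inputs stand in for embeddings that would, in practice, come from a BERT encoder and from BERTopic.

```python
from typing import List, Sequence

Vector = List[float]


def l2_normalize(vec: Sequence[float]) -> Vector:
    """Scale a vector to unit L2 norm; an all-zero vector stays all-zero."""
    norm = sum(x * x for x in vec) ** 0.5
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def combine_embeddings(bert_emb: Sequence[float],
                       topic_emb: Sequence[float]) -> Vector:
    """Combine one tweet's BERT embedding with its topical embedding.

    Assumption: the two vectors are normalised and concatenated. The
    source does not state the actual combination operator.
    """
    return l2_normalize(bert_emb) + l2_normalize(topic_emb)


def build_features(bert_embs: Sequence[Sequence[float]],
                   topic_embs: Sequence[Sequence[float]]) -> List[Vector]:
    """Build one combined feature vector per tweet for a single dataset
    (for example, the Spanish originals or the English translations)."""
    if len(bert_embs) != len(topic_embs):
        raise ValueError("expected one topical embedding per BERT embedding")
    return [combine_embeddings(b, t) for b, t in zip(bert_embs, topic_embs)]


if __name__ == "__main__":
    # Toy stand-ins: two tweets, 2-dimensional "BERT" vectors and
    # 3-dimensional "topical" vectors.
    bert = [[3.0, 4.0], [1.0, 0.0]]
    topics = [[0.0, 2.0, 0.0], [1.0, 1.0, 0.0]]
    features = build_features(bert, topics)
    print(len(features), len(features[0]))
```

Under these assumptions, each combined vector has a dimensionality equal to the sum of the two input dimensionalities, and the resulting feature lists can be passed to any downstream classifier.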