AMIR: Automated MisInformation Rebuttal -- A COVID-19 Vaccination Datasets based Recommendation System
Shakshi Sharma, Anwitaman Datta, Rajesh Sharma
TL;DR
AMIR tackles scalable rebuttal of COVID-19 misinformation on Twitter by integrating two complementary strategies: repurposing relevant non-misleading tweets and surfacing fact-checked articles from FaCov. It leverages the MisCovid and FaCov datasets, applies LDA for fine-grained topic modeling, augments named entity recognition (NER) with a VAC_TYPE label to recognize vaccine entities, and uses semantic similarity (Sentence Transformers) to match misinformation with counter-information. Recommendations are evaluated via Mean Reciprocal Rank ($MRR$) and Mean Average Precision ($MAP$). The work introduces topic-topic mapping across corpora and a three-tier recommendation scheme (Specific, Almost near, General/Broad), and shows that 15 fact-checked articles can effectively counter targeted misinformation, while remaining adaptable to additional data sources. Overall, AMIR presents a practical, modular architecture for automated rebuttals that can generalize beyond COVID-19 and to other social platforms and languages, with future work aimed at live deployment and multilingual capabilities.
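To make the two evaluation metrics named above concrete, the following is a minimal sketch of how $MRR$ (Mean Reciprocal Rank) and $MAP$ (Mean Average Precision) are commonly computed over ranked recommendation lists. It is not the authors' evaluation code. The function names and the `judged` example data are hypothetical, and average precision is assumed here to be normalized by the number of relevant items present in each ranked list.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision@k over the ranks k that hold a relevant item.

    Assumption: normalized by the number of relevant items in this list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_reciprocal_rank(queries: Sequence[Sequence[bool]]) -> float:
    """Mean of the reciprocal rank over all queries (0.0 if there are none)."""
    if not queries:
        return 0.0
    return sum(reciprocal_rank(q) for q in queries) / len(queries)


def mean_average_precision(queries: Sequence[Sequence[bool]]) -> float:
    """Mean of the average precision over all queries (0.0 if there are none)."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)


# Hypothetical judgments: one list per misleading tweet, ordered by rank;
# True marks a recommended rebuttal that an annotator judged relevant.
judged = [
    [False, True, True],
    [True, False, False],
]
mrr = mean_reciprocal_rank(judged)
map_score = mean_average_precision(judged)
```

In this sketch each misleading tweet plays the role of a query, and its ranked list of candidate rebuttals (tweets or fact-checked articles) is scored by where the relevant items fall. $MRR$ rewards placing the first relevant rebuttal early, while $MAP$ also accounts for the positions of all further relevant items.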
Abstract
Misinformation has emerged as a major societal threat in recent years. In the context of the COVID-19 pandemic in particular, it has wreaked havoc, for instance by fuelling vaccine hesitancy. Cost-effective, scalable solutions for combating misinformation are the need of the hour. This work explores how existing information obtained from social media, augmented with more curated fact-checked data repositories, can be harnessed to facilitate automated rebuttal of misinformation at scale. While the ideas herein can be generalized and reapplied in the broader context of misinformation mitigation using a multitude of information sources and catering to the spectrum of social media platforms, this work serves as a proof of concept; as such, its scope is confined to the rebuttal of tweets, in the specific context of misinformation regarding COVID-19. It leverages two publicly available datasets, viz. FaCov (fact-checked articles) and misleading social media (Twitter) data on COVID-19 vaccination.
