Examining Temporalities on Stance Detection towards COVID-19 Vaccination
Yida Mu, Mali Jin, Kalina Bontcheva, Xingyi Song
TL;DR
The paper investigates how temporal dynamics influence stance detection toward COVID-19 vaccination on Twitter by comparing chronological versus random data splits across multiple transformer models and multilingual datasets. It finds that chronological splits markedly reduce accuracy, indicating temporal concept drift, while domain-adapted PLMs partially mitigate this drop. Through text similarity and topic-drift analyses using IoU/DICE metrics and BERTopic, the work shows that changes in vocabulary and topic distributions over time contribute to performance degradation. The findings emphasize the need for temporally aware modeling and potential ensemble or adaptation strategies to improve real-world stance detection in evolving public discourse.
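As a rough illustration of the IoU and DICE overlap measures mentioned above, the sketch below compares the vocabularies of two sets of texts. The whitespace tokenizer and the helper names (`vocab`, `iou`, `dice`) are assumptions made for this example; the summary does not state how the paper tokenizes text or which sets it compares.

```python
def vocab(texts):
    """Build a vocabulary set from texts (assumed: lowercase whitespace tokens)."""
    return {token for text in texts for token in text.lower().split()}


def iou(a, b):
    """Intersection over Union (Jaccard index) of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    """DICE coefficient: twice the intersection size over the summed set sizes."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


# Hypothetical example texts, not taken from any dataset in the paper.
earlier = vocab(["vaccine safe", "vaccine dose"])
later = vocab(["vaccine booster", "dose mandate"])
print(iou(earlier, later), dice(earlier, later))
```

Lower overlap between the vocabulary of an earlier and a later period would be consistent with the vocabulary shift the summary describes.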
Abstract
Previous studies have highlighted the importance of vaccination as an effective strategy to control the transmission of the COVID-19 virus. It is crucial for policymakers to have a comprehensive understanding of the public's stance towards vaccination on a large scale. However, attitudes towards COVID-19 vaccination expressed on social media, such as pro-vaccine or vaccine-hesitant stances, have evolved over time. Thus, it is necessary to account for possible temporal shifts when analysing these stances. This study aims to examine the impact of temporal concept drift on stance detection towards COVID-19 vaccination on Twitter. To this end, we evaluate a range of transformer-based models using chronological splits (the training, validation, and test sets are split in order of time) and random splits (the three sets are split at random) of social media data. Our findings reveal significant discrepancies in model performance between random and chronological splits on several existing COVID-19-related datasets; specifically, chronological splits significantly reduce the accuracy of stance classification. Therefore, real-world stance detection approaches need to be further refined to incorporate temporal factors as a key consideration.
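The two splitting strategies described above can be sketched as follows. This is a minimal illustration, not the authors' code: the 70/10/20 ratios, the `date` field, and the function names are assumptions, since the abstract does not specify them.

```python
import random
from datetime import date, timedelta


def _cut(items, ratios):
    """Cut an ordered list into train/validation/test parts by the given ratios."""
    n_train = round(len(items) * ratios[0])
    n_val = round(len(items) * ratios[1])
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


def random_split(items, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle first, so every part mixes posts from all time periods."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return _cut(shuffled, ratios)


def chronological_split(items, ratios=(0.7, 0.1, 0.2)):
    """Sort by date first, so train precedes validation, which precedes test."""
    ordered = sorted(items, key=lambda item: item["date"])
    return _cut(ordered, ratios)


# Hypothetical toy data: ten posts, one per day, in scrambled order.
posts = [{"id": i, "date": date(2021, 1, 1) + timedelta(days=i)}
         for i in (3, 7, 0, 9, 1, 5, 2, 8, 4, 6)]

train, val, test = chronological_split(posts)
print([p["id"] for p in train], [p["id"] for p in val], [p["id"] for p in test])
```

Under the chronological split, the model is evaluated only on posts written after everything it was trained on, which is the setting where the abstract reports reduced accuracy.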
