Explainable AI for Sentiment Analysis of Human Metapneumovirus (HMPV) Using XLNet

Md. Shahriar Hossain Apu, Md Saiful Islam, Tanjim Taharat Aurpa

TL;DR

This study examines public sentiment toward Human Metapneumovirus (HMPV) by analyzing YouTube comments with a transformer-based approach. It builds a sentiment analysis pipeline centered on XLNet, which reaches 93.50% accuracy, and uses SHAP to explain individual predictions. The methodology covers collection of a dataset of 9,758 cleaned comments, VADER-based labeling, preprocessing, and evaluation with a confusion matrix, ROC AUC, and per-class metrics. SHAP exposes the linguistic features that drive each decision, which supports error analysis and greater trust in model outputs. The work shows the practical value of combining transformer-based NLP with explainability techniques to monitor public discourse and inform health communication strategies during HMPV outbreaks.
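As a rough illustration of two steps named above, the sketch below maps a VADER-style compound score to a sentiment label and computes per-class precision, recall, and F1 from true and predicted labels. The ±0.05 cutoffs follow the convention suggested in VADER's documentation and are an assumption here, as are the three class names and the function names `label_from_compound` and `per_class_metrics`; the summary does not state the thresholds or classes the authors used.

```python
from collections import Counter

# Assumed cutoffs: the +/-0.05 convention from VADER's documentation.
# The thresholds actually used in the study are not stated in this summary.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05


def label_from_compound(compound: float) -> str:
    """Map a compound score in [-1, 1] to a sentiment label (hypothetical helper)."""
    if compound >= POS_THRESHOLD:
        return "positive"
    if compound <= NEG_THRESHOLD:
        return "negative"
    return "neutral"


def per_class_metrics(y_true, y_pred):
    """Compute precision, recall, and F1 for each class from paired labels."""
    labels = sorted(set(y_true) | set(y_pred))
    # Count each (true, predicted) pair; this is the confusion matrix in sparse form.
    pairs = Counter(zip(y_true, y_pred))
    metrics = {}
    for label in labels:
        tp = pairs[(label, label)]
        fp = sum(n for (t, p), n in pairs.items() if p == label and t != label)
        fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


if __name__ == "__main__":
    # Made-up compound scores, for illustration only.
    scores = [0.62, -0.41, 0.01]
    print([label_from_compound(s) for s in scores])

    y_true = ["positive", "positive", "negative", "neutral"]
    y_pred = ["positive", "negative", "negative", "neutral"]
    for name, m in per_class_metrics(y_true, y_pred).items():
        print(name, m)
```

In a real pipeline the compound scores would come from a VADER analyzer and the predictions from the fine-tuned classifier; both are replaced by hand-written values here so the sketch runs without external dependencies.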

Abstract

In 2024, the outbreak of Human Metapneumovirus (HMPV) in China, which later spread to the UK and other countries, raised significant public concern. While HMPV typically causes mild symptoms, its effects on vulnerable individuals prompted health authorities to emphasize preventive measures. This paper explores how sentiment analysis can enhance our understanding of public reactions to HMPV by analyzing social media data. We apply transformer models, particularly XLNet, achieving 93.50% accuracy in sentiment classification. Additionally, we use explainable AI (XAI) through SHAP to improve model transparency.
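To make the idea behind SHAP concrete, the following minimal sketch computes exact Shapley values for the tokens of a short comment under a toy scoring function. It is an assumption-laden illustration, not the authors' setup: `toy_score` and its small lexicon are hypothetical stand-ins for the XLNet classifier, and the SHAP library relies on approximations rather than the full subset enumeration shown here, which is only feasible for a handful of tokens.

```python
from itertools import combinations
from math import factorial


def toy_score(tokens):
    """Hypothetical sentiment scorer standing in for a real model."""
    lexicon = {"good": 1.0, "scary": -1.0}  # made-up word weights
    score = sum(lexicon.get(t, 0.0) for t in tokens)
    # Made-up interaction: "not" together with "good" flips the sentiment.
    if "not" in tokens and "good" in tokens:
        score -= 2.0
    return score


def shapley_values(tokens, value_fn):
    """Exact Shapley value of each token by enumerating all subsets of the others."""
    n = len(tokens)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = value_fn([tokens[j] for j in sorted(subset + (i,))])
                without_i = value_fn([tokens[j] for j in subset])
                phi[i] += weight * (with_i - without_i)
    return phi


if __name__ == "__main__":
    tokens = ["not", "good"]
    for token, value in zip(tokens, shapley_values(tokens, toy_score)):
        print(f"{token}: {value:+.2f}")
```

The attributions sum to the difference between the score of the full comment and the score of an empty input, which is the additivity property that makes per-token explanations of this kind interpretable.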

Paper Structure

This paper contains 20 sections, 7 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Data Collection and Preprocessing.
  • Figure 2: Proposed XLNet Model.
  • Figure 3: The working process of SHAP.
  • Figure 4: Workflow of the proposed approach.
  • Figure 5: Comparison of Accuracy, Precision, Recall, and F1 Score for different transformer models: BERT, ELECTRA, RoBERTa, ALBERT, DistilBERT, and XLNet.
  • ...and 5 more figures