Multi Class Depression Detection Through Tweets using Artificial Intelligence

Muhammad Osama Nusrat, Waseem Shahzad, Saad Ahmed Jamal

TL;DR

This study tackles multiclass depression-type detection from Twitter by constructing a lexicon-guided, manually annotated dataset covering Bipolar, Major, Psychotic, Atypical, and Postpartum depression. It compares traditional ML (SVM, Random Forest, Naive Bayes) and DL approaches (CNN, LSTM, and BERT), with BERT fine-tuning delivering the highest accuracy around 0.96, and Random Forest achieving ~0.947 among ML methods. A key contribution is the use of Explainable AI (LIME/SHAP) to highlight the textual evidence driving each prediction, enabling transparent type-specific reasoning. The work provides a publicly available depression-type corpus and a robust pipeline, offering practical value for mental health screening and further research while acknowledging limitations like dataset size and the need for broader type coverage and privacy considerations.

Abstract

Depression is a significant public health issue. According to the World Health Organization (WHO), more than 280 million people were living with depression as of 2023, and this number is likely to grow if the problem is not taken seriously. About 4.89 billion individuals are social media users, and many of them express their feelings and emotions on platforms such as Twitter, Facebook, Reddit, and Instagram. These platforms therefore contain valuable information for research. Considerable research has been conducted across various social media platforms, but certain limitations persist. In particular, previous studies focused only on detecting depression and its intensity in tweets, and their datasets contained labeling inaccuracies. In this work, five types of depression (bipolar, major, psychotic, atypical, and postpartum) were predicted from tweets collected from Twitter and labeled using a lexicon. Explainable AI was used to provide reasoning by highlighting the parts of a tweet that indicate the type of depression. Bidirectional Encoder Representations from Transformers (BERT) was used for feature extraction and training. Both machine learning and deep learning models were trained and compared. The BERT model gave the most promising results, achieving an overall accuracy of 0.96.
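As a rough illustration of what lexicon-based labeling of tweets might look like, the sketch below assigns a tweet to the depression type whose keywords it mentions most often. The lexicon terms, the `LEXICON` dictionary, and the `label_tweet` helper are hypothetical placeholders chosen for illustration; they are not the lexicon or the procedure used in the paper.

```python
import re
from collections import Counter

# Hypothetical keyword lexicon: illustrative placeholder terms only,
# NOT the lexicon used by the authors.
LEXICON = {
    "bipolar": {"bipolar", "manic", "mania"},
    "major": {"hopeless", "worthless", "mdd"},
    "psychotic": {"hallucination", "hallucinations", "delusion", "delusions"},
    "atypical": {"oversleeping", "overeating"},
    "postpartum": {"postpartum", "newborn"},
}


def label_tweet(text):
    """Return the type whose lexicon terms occur most often, or None if no term matches.

    Ties are broken by the order of the keys in LEXICON.
    """
    # Lowercase the tweet and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for label, terms in LEXICON.items():
        counts[label] = sum(1 for tok in tokens if tok in terms)
    best_label, best_count = max(counts.items(), key=lambda kv: kv[1])
    return best_label if best_count > 0 else None


if __name__ == "__main__":
    print(label_tweet("Feeling manic again, my bipolar swings are exhausting"))
    print(label_tweet("Great weather today"))
```

A keyword count like this is only a starting point: it ignores negation and context, so a real pipeline would probably pair it with manual review before the labels are used for training.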

Paper Structure

This paper contains 44 sections, 14 figures, 18 tables.

Figures (14)

  • Figure 1: Dataset snippet (a)
  • Figure 2: An overview of proposed pipeline
  • Figure 3: Dataset Details
  • Figure 4: CNN training validation accuracy and training validation loss
  • Figure 5: CNN with GloVe training validation accuracy and training validation loss
  • ...and 9 more figures