Diverse Perspectives, Divergent Models: Cross-Cultural Evaluation of Depression Detection on Twitter

Nuredin Ali; Charles Chuankai Zhang; Ned Mayo; Stevie Chancellor

TL;DR

Depression detection models trained on benchmark datasets do not generalize globally. Pre-trained language models generalize better than Logistic Regression, but they still show significant performance gaps on depressed and non-Western users.

Abstract

Social media data has been used to detect users with mental disorders such as depression. Although cross-cultural representation matters globally and can affect model performance, publicly available datasets often lack the metadata needed to assess it. In this work, we evaluate how well AI models built on benchmark datasets generalize to cross-cultural Twitter data. We gather a custom geo-located Twitter dataset of depressed users from seven countries to serve as a test dataset. Our results show that depression detection models do not generalize globally: the models perform worse on Global South users than on Global North users. Pre-trained language models generalize better than Logistic Regression, though they still show significant gaps in performance on depressed and non-Western users. We quantify our findings and provide several actionable suggestions to mitigate this issue.

Paper Structure (15 sections, 2 figures, 5 tables)

This paper contains 15 sections, 2 figures, 5 tables.

Introduction
Datasets
Preprocessing
Baseline Models
Results
Global North vs. Global South
Country Level Analysis
Qualitative Error Analysis
Recommendation and Conclusion
Ethical Considerations
Limitations
Appendix
Human Verification of Authentic Mental Health Disclosures
Distribution of Tokens
Example of Genuine and Non-Genuine Disclosures

Figures (2)

Figure 1: Flow chart of the overall design of the work. This shows the training and evaluation process. n=datasets, m=models.
Figure 2: The box plot illustrates the distribution of tokens across the datasets.
