
Prompt-based mental health screening from social media text

Wesley Ramos dos Santos, Ivandre Paraboni

TL;DR

The paper tackles scalable mental health screening from large, noisy social media streams by proposing Prompt.Bow, a two-stage approach that uses GPT-3.5 prompting to filter posts by potential mental health relevance and then applies a lightweight bag-of-words classifier, trained on T5-labeled data, to predict user-level depression labels. The method constructs multiple BoW feature spaces from different relevance strata and augments them with a bigram sequence model before classifying with logistic regression. Results on the SetembroBR corpus show competitive performance (Precision 0.64, Recall 0.72, F1 0.66) compared to the state-of-the-art BERT mixture of experts while significantly reducing downstream training costs. This approach demonstrates a practical, cost-efficient path for applying prompt-based NLP to large-scale mental health screening and suggests avenues for incorporating explicit clinical definitions and linguistic indicators in future work.

Abstract

This article presents a method for prompt-based mental health screening from a large and noisy dataset of social media text. Our method uses GPT-3.5 prompting to identify the publications that are likely to be most relevant to the task, and then uses a straightforward bag-of-words text classifier to predict the actual user labels. Results are on par with those of a BERT mixture-of-experts classifier, while incurring only a fraction of its training costs.
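
The two-stage idea described above (filter posts by relevance, then classify users from bag-of-words counts) can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword rule `is_relevant` is a hypothetical stand-in for the GPT-3.5 prompt, the toy English posts and labels are invented for illustration, and the sketch uses a single feature space with a hand-written logistic regression rather than the multiple relevance strata and bigram model mentioned in the TL;DR.

```python
import math
from collections import Counter

# Hypothetical cue list standing in for the GPT-3.5 relevance prompt.
CUES = {"sad", "tired", "alone", "sleep", "hopeless"}

def is_relevant(post):
    """Stage 1 (stub): keep a post only if it contains a cue word."""
    return any(tok in CUES for tok in post.lower().split())

def relevant_tokens(posts):
    """Tokens of the posts that pass the relevance filter."""
    return [tok for p in posts if is_relevant(p) for tok in p.lower().split()]

def bow_vector(posts, vocab):
    """Stage 2 features: user-level bag-of-words counts over relevant posts."""
    counts = Counter(relevant_tokens(posts))
    return [float(counts[w]) for w in vocab]

def train_logreg(X, y, lr=0.1, epochs=500):
    """Logistic regression fitted with plain batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return 1 if the linear score is positive, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

# Invented toy users: (posts, label); label 1 marks the positive class.
USERS = [
    (["i feel sad and alone", "great game tonight"], 1),
    (["so tired cannot sleep", "feeling hopeless again"], 1),
    (["tired after the gym but happy", "nice weather"], 0),
    (["could not sleep because of the party", "love this song"], 0),
]

vocab = sorted({tok for posts, _ in USERS for tok in relevant_tokens(posts)})
X = [bow_vector(posts, vocab) for posts, _ in USERS]
y = [label for _, label in USERS]
w, b = train_logreg(X, y)
preds = [predict(w, b, x) for x in X]
print(preds)
```

In this sketch the expensive model is consulted only in the filtering stage, so the downstream classifier trains on a much smaller, cleaner set of tokens, which is the cost argument the abstract makes.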

Paper Structure

This paper contains 8 sections, 1 figure, and 4 tables.

Figures (1)

  • Figure 1: Prompt instruction to GPT.