
Reliability Analysis of Psychological Concept Extraction and Classification in User-penned Text

Muskan Garg, MSVPJ Sathvik, Amrit Chadha, Shaina Raza, Sunghwan Sohn

TL;DR

This work addresses reliability in psychology-grounded extraction and classification of low self-esteem cues from user-generated text by introducing a TLC (Trigger, LoST indicators, Consequences) annotation scheme and a LoST-focused dataset built from Reddit posts. It combines expert-guided annotations with a BERT-based reliability analysis that uses attention to identify text spans and a binary classifier to detect low self-esteem, evaluating both general PLMs (e.g., BERT, ALBERT, DeBERTa) and mental-health domain models (e.g., MentalBERT, PsychBERT, ClinicalBERT) on in-domain and out-of-distribution data. Explanations are generated with LIME and compared against the annotated TLC cues using ROUGE and BLEU. The comparison shows that models aligning with LoST indicators provide more faithful explanations, though attention often drifts toward Triggers or Consequences. The study underscores the need for reliability and interpretability in mental-health NLP, delivers a new annotated corpus, and suggests training and architectural strategies to emphasize LoST cues, with practical implications for safer, more trustworthy health-support systems. Future work includes refining attention mechanisms, incorporating external knowledge graphs, and expanding datasets to improve generalization and clinician-facing trust.
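
To make the explanation-versus-annotation comparison concrete, the following is a minimal sketch of a ROUGE-1 style unigram overlap between the words an explainer highlights and an annotated cue span. It is not the authors' evaluation code: the function names `tokenize` and `rouge1`, the simple regex tokenizer, and the example strings are all assumptions made for illustration, and a real evaluation would likely rely on an established ROUGE/BLEU implementation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (simplified tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def rouge1(candidate, reference):
    """Unigram-overlap precision, recall, and F1 between two strings.

    candidate: words highlighted by an explainer (e.g. top LIME features).
    reference: an annotated cue span (Trigger, LoST indicator, or Consequence).
    """
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example strings, invented for illustration only.
explanation_words = "lost job worthless"
trigger_span = "lost my job"
indicator_span = "I feel worthless"

trigger_score = rouge1(explanation_words, trigger_span)
indicator_score = rouge1(explanation_words, indicator_span)
print("Trigger F1:", round(trigger_score["f1"], 3))
print("LoST indicator F1:", round(indicator_score["f1"], 3))
```

In this toy case the highlighted words overlap more with the Trigger span than with the LoST indicator span, which is the kind of mismatch the reliability analysis is meant to surface.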

Abstract

The social NLP research community has recently witnessed a surge in computational advances in mental health analysis, aimed at building responsible AI models for the complex interplay between language use and self-perception. Such responsible AI models aid in quantifying psychological concepts from user-penned texts on social media. Thinking beyond the low-level (classification) task, we advance an existing binary classification dataset toward a higher-level task of reliability analysis through the lens of explanations, posing it as one of the safety measures. We annotate the LoST dataset to capture nuanced textual cues that suggest the presence of low self-esteem in the posts of Reddit users. We further state that NLP models developed for determining the presence of low self-esteem focus more on three types of textual cues: (i) Trigger: words that trigger mental disturbance, (ii) LoST indicators: text indicators emphasizing low self-esteem, and (iii) Consequences: words describing the consequences of mental disturbance. We implement existing classifiers to examine the attention mechanism in pre-trained language models (PLMs) for a domain-specific, psychology-grounded task. Our findings suggest the need to shift the focus of PLMs from Trigger and Consequences to a more comprehensive explanation, emphasizing LoST indicators when determining low self-esteem in Reddit posts.

Paper Structure

This paper contains 32 sections, 8 equations, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Overview of our task. We annotate the textual cues indicating low self-esteem in human writing, emphasizing the need to focus on LoST indicators more than on triggering words and final consequences.
  • Figure 2: Architecture of the reliability analysis for low self-esteem detection and classification in user-penned text.