Deconstructing Depression Stigma: Integrating AI-driven Data Collection and Analysis with Causal Knowledge Graphs

Han Meng, Renwen Zhang, Ganyi Wang, Yitian Yang, Peinuan Qin, Jungup Lee, Yi-Chieh Lee

TL;DR

This study introduces an AI-enabled pipeline that uses chatbot interviews to collect rich qualitative data on depression stigma, AI-assisted coding to label stigma attributions, and a causal knowledge graph (CKG) to model relationships between constructs. The AI-assisted coding achieves substantial agreement with human coding (overall κ ≈ 0.69) and outperforms baselines, while a large-scale CKG (over 13k entities and 18k+ relations) reveals 11 stigma-related constructs and novel pathways beyond classic attribution theory. Case studies and a conceptual model both confirm established theories and surface new causal links among beliefs, emotions, and behavioral intentions. The work highlights practical implications for real-time, personalized anti-stigma interventions and suggests a scalable framework for cross-cultural psychological datasets and theory-driven interventions in HCI and mental health research.

Abstract

Mental-illness stigma is a persistent social problem, hampering both treatment-seeking and recovery. Accordingly, there is a pressing need to understand it more clearly, but analyzing the relevant data is highly labor-intensive. Therefore, we designed a chatbot to engage participants in conversations; coded those conversations qualitatively with AI assistance; and, based on those coding results, built causal knowledge graphs to decode stigma. The results we obtained from 1,002 participants demonstrate that conversation with our chatbot can elicit rich information about people's attitudes toward depression, while our AI-assisted coding was strongly consistent with human-expert coding. Our novel approach combining large language models (LLMs) and causal knowledge graphs uncovered patterns in individual responses and illustrated the interrelationships of psychological constructs in the dataset as a whole. The paper also discusses these findings' implications for HCI researchers in developing digital interventions, decomposing human psychological constructs, and fostering inclusive attitudes.
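The consistency between AI-assisted and human-expert coding is reported in the TL;DR as an overall κ of about 0.69. The summary does not say which agreement statistic κ denotes; assuming it is Cohen's kappa, the sketch below shows how such a score, and the human-versus-AI code counts behind an agreement heatmap, could be computed. The label names and the two short code lists are invented for illustration and are not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:  # both raters used one identical label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


def agreement_counts(human, ai):
    """Count (human code, AI code) pairs; these are the cells of a heatmap."""
    return Counter(zip(human, ai))


# Hypothetical codes for six messages (placeholders, not the study's data).
human_codes = ["responsibility", "anger", "pity", "anger", "pity", "fear"]
ai_codes = ["responsibility", "anger", "pity", "pity", "pity", "fear"]

kappa = cohens_kappa(human_codes, ai_codes)
cells = agreement_counts(human_codes, ai_codes)
print(f"kappa = {kappa:.3f}")
print(cells[("anger", "pity")])  # messages coded anger by human, pity by AI
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than the simple proportion of matching codes.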

Paper Structure

This paper contains 61 sections, 8 figures, and 2 tables.

Figures (8)

  • Figure 1: Methodology overview. We propose an approach that deconstructs depression stigma in two main phases: I. AI-assisted Data Collection and Analysis Pipeline (RQ1) and II. Causal Knowledge Graph Construction (RQ2).
  • Figure 2: Overview of the AI-assisted data collection and analysis pipeline. This pipeline encompasses three main steps: Data Collection to gather interview data using an AI-powered chatbot, Human Coding to establish expert codes and develop a codebook, and AI-assisted Coding to expand coding to larger datasets and detect and categorize stigma-related expressions.
  • Figure 3: Key steps in the construction workflow for a causal knowledge graph of depression stigma: Triple Extraction, where we extract entity-relation-entity triplets from participant messages; Ontologization, where we map entities to theoretical constructs; and Entity Resolution, where we merge semantically similar entities. These steps lay the foundation for Conceptual-model Construction, where we discover emerging themes and interrelationships between constructs (not shown in the figure).
  • Figure 4: Two-dimensional principal component analysis (PCA) projection of word embeddings from participant messages. The words shown are the most frequent from the 200 $k$-means clusters, and circle sizes represent cluster frequencies. Colored arrows indicate weighted average vectors for different attributions, and word positioning reflects semantic similarity. Highlighted words near attribution arrows represent key terms closely associated with each stigma attribution.
  • Figure 5: Heatmap showing the agreement between human-derived codes and AI-generated codes. The numbers in each cell represent the frequency of consistency, and darker colors indicate closer agreement.
  • ...and 3 more figures
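The knowledge-graph workflow described for Figure 3 (triple extraction, ontologization, entity resolution) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the example triples, the alias table standing in for semantic-similarity merging, and the entity-to-construct mapping are hypothetical, and the sketch applies the alias merge before the construct lookup only for brevity. The paper's actual extraction and merging methods are not specified in this summary.

```python
from collections import Counter

# Hypothetical entity-relation-entity triples, as if extracted from messages.
TRIPLES = [
    ("laziness", "causes", "blame"),
    ("Being lazy", "causes", "blame"),
    ("blame", "leads to", "avoidance"),
]

# Hypothetical alias table standing in for entity resolution
# (merging semantically similar entities into one canonical node).
ALIASES = {"being lazy": "laziness"}

# Hypothetical ontology: which theoretical construct each entity belongs to.
CONSTRUCT_OF = {
    "laziness": "belief",
    "blame": "emotion",
    "avoidance": "behavioral intention",
}


def resolve(entity):
    """Normalize an entity string and map known aliases to a canonical name."""
    key = entity.strip().lower()
    return ALIASES.get(key, key)


def build_entity_graph(triples):
    """Return weighted edges keyed by (head, relation, tail) after merging."""
    edges = Counter()
    for head, relation, tail in triples:
        edges[(resolve(head), relation, resolve(tail))] += 1
    return edges


def construct_level_edges(edges):
    """Aggregate entity-level edges into construct-to-construct edge weights."""
    lifted = Counter()
    for (head, _relation, tail), weight in edges.items():
        pair = (
            CONSTRUCT_OF.get(head, "unmapped"),
            CONSTRUCT_OF.get(tail, "unmapped"),
        )
        lifted[pair] += weight
    return lifted


entity_edges = build_entity_graph(TRIPLES)
construct_edges = construct_level_edges(entity_edges)
for (source, target), weight in sorted(construct_edges.items()):
    print(f"{source} -> {target}: {weight}")
```

Aggregating entity-level edges up to constructs is one simple way to move from individual responses to relationships between constructs in the dataset as a whole.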