Dynamic Topic Language Model on Heterogeneous Children's Mental Health Clinical Notes

Hanwen Ye, Tatiana Moreno, Adrianne Alpern, Louis Ehwerhemuepha, Annie Qu

TL;DR

We develop a longitudinal topic model with time-invariant topics and individualized temporal dependencies on evolving document metadata. The model preserves the semantic meaning of discovered topics over time and incorporates heterogeneity among documents.

Abstract

Mental health diseases affect children's lives and well-being, and they have received increased attention since the COVID-19 pandemic. Analyzing psychiatric clinical notes with topic models is critical to evaluating children's mental status over time. However, few topic models are built for longitudinal settings, and most existing approaches fail to capture temporal trajectories for each document. To address these challenges, we develop a dynamic topic model with consistent topics and individualized temporal dependencies on the evolving document metadata. Our model preserves the semantic meaning of discovered topics over time and incorporates heterogeneity among documents. In particular, when documents can be categorized, we propose a classifier-free approach to maximize topic heterogeneity across different document groups. We also present an efficient variational optimization procedure adapted for the multistage longitudinal setting. In this case study, we apply our method to the psychiatric clinical notes from a large tertiary pediatric hospital in Southern California and achieve a 38% increase in the overall coherence of extracted topics. Our real data analysis reveals that children tend to express more negative emotions during state shutdowns and more positive emotions when schools reopen. Furthermore, it suggests that sexual and gender minority (SGM) children display more pronounced reactions to major COVID-19 events and greater sensitivity to vaccine-related news than non-SGM children. This study examines children's mental health progression during the pandemic and offers clinicians valuable insights for recognizing disparities in children's mental health related to their sexual and gender identities.

Paper Structure

This paper contains 18 sections, 1 theorem, 9 equations, 7 figures, 2 tables, 3 algorithms.

Key Result

Proposition 1

Under the paper's assumptions (the range labeled Markov through independence), the evidence lower bound (ELBO) for a single document generated by the HDTM process over a finite $T$-stage longitudinal time horizon is

Figures (7)

  • Figure 1: A graphical model of the heterogeneous DTM in a balanced multistage longitudinal setting, where $\theta_0$ is a prior for the initial topic proportions. The topics $\beta$ are held constant and provided at every time stage. Both $\theta$ and $\beta$ are latent variables, whereas $W$, $X$ and $Y$ are observed variables.
  • Figure 2: Comparison of topic proportion distributions under two sets of latent topics: $\beta_1$, $\beta_2$ versus $\Tilde{\beta}_1$, $\Tilde{\beta}_2$. The shapes displayed on a two-dimensional plane correspond to optimal topic proportions for each latent topic specification. Solid outlines indicate proportions from the true group identity, while dashed outlines represent counterfactual proportions.
  • Figure 3: Boxplots of the dominant topic accuracy from estimated topic proportions versus the number of time stages when N=1000, K=8, where the generative prior function is non-linear. The left-to-right order of boxplot methods matches the top-to-bottom order of the legend.
  • Figure 4: Boxplots of the group-membership accuracy from estimated topic proportions versus the number of time stages when N=1000, K=8, and the generative prior function is non-linear. The left-to-right order of boxplot methods matches the top-to-bottom order of the legend.
  • Figure 5: Wordclouds of three topics extracted by HCF-DTM.
  • ...and 2 more figures
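
The caption of Figure 3 refers to a "dominant topic accuracy" computed from estimated topic proportions. The listing above does not define the metric, so the following is only a minimal sketch under one plausible reading: the fraction of documents whose largest estimated topic proportion falls on the same topic as the largest true proportion. The function names, the toy numbers, and the assumption that estimated topics are already aligned with the true topics are all illustrative and are not taken from the paper.

```python
def dominant_topic(proportions):
    """Index of the largest entry in one document's topic-proportion vector."""
    return max(range(len(proportions)), key=lambda k: proportions[k])


def dominant_topic_accuracy(theta_true, theta_est):
    """Share of documents whose dominant estimated topic matches the true one.

    Assumption (not from the paper): both inputs are lists of per-document
    proportion vectors over the same K topics, with topic indices already
    aligned between the true and estimated models.
    """
    if len(theta_true) != len(theta_est) or not theta_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        dominant_topic(t) == dominant_topic(e)
        for t, e in zip(theta_true, theta_est)
    )
    return hits / len(theta_true)


# Toy example with 4 documents and K=2 topics (made-up numbers).
true_props = [[0.7, 0.3], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]
est_props = [[0.6, 0.4], [0.4, 0.6], [0.3, 0.7], [0.2, 0.8]]
accuracy = dominant_topic_accuracy(true_props, est_props)
```

In this toy case the third document's dominant topic is misidentified, so three of the four documents match. In a real comparison the estimated topics would first have to be matched to the true topics, since topic indices from a fitted model are arbitrary.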

Theorems & Definitions (1)

  • Proposition 1