Improving Forecasts of Suicide Attempts for Patients with Little Data

Genesis Hang, Annie Chen, Hope Neveux, Matthew K. Nock, Yaniv Yacoby

TL;DR

Problem: Forecasting imminent suicide attempts from EMA data is difficult due to the rarity of events and heterogeneity across patients. Approach: The authors propose Latent Similarity Gaussian Processes (LSGPs), embedding patients in a latent space to share trends among similar patients and using a sparse variational GP with inducing points and a product kernel $K_\theta(\widehat{X},\widehat{X}^\prime) = K_\theta^x(X,X^\prime) \odot K_\theta^z(Z,Z^\prime)$. Contributions: (A) empirical demonstration of strong heterogeneity and underperformance of a single model; (B) development of a continuous latent similarity model that enables data-scarce patients to borrow information; (C) preliminary results showing competitive performance without substantial kernel design and a graph-based view of patient similarity that does not align with demographics; (D) analysis showing demographics do not explain similarity via low modularity. Significance: provides improved forecasting for patients with little data and offers a framework for understanding patient subtypes beyond discrete groupings, with potential to guide interventions.
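The product kernel in the TL;DR can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the squared-exponential form of both factors, the lengthscale parameters, the 2-D latent embeddings, and the names `rbf_kernel`, `product_kernel`, `patient_latents`, and `patient_ids` are all assumptions made for illustration. The source specifies only that the kernel on the augmented inputs is the elementwise product of a kernel on the observed inputs $X$ and a kernel on the patient latents $Z$.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of A (n, d) and B (m, d).

    The RBF form is an assumption; the source does not state which
    kernels are used for K^x and K^z.
    """
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def product_kernel(X, Z, X2, Z2, ls_x=1.0, ls_z=1.0):
    """K((X, Z), (X2, Z2)) = K^x(X, X2) * K^z(Z, Z2), elementwise."""
    return rbf_kernel(X, X2, ls_x) * rbf_kernel(Z, Z2, ls_z)

# Toy data (hypothetical): 5 observations from 3 patients.
rng = np.random.default_rng(0)
patient_latents = rng.normal(size=(3, 2))  # one latent vector per patient
patient_ids = np.array([0, 0, 1, 2, 2])    # which patient produced each row
X = rng.normal(size=(5, 4))                # observed inputs (e.g. EMA features)
Z = patient_latents[patient_ids]           # each row paired with its patient's latent

K_x = rbf_kernel(X, X)
K = product_kernel(X, Z, X, Z)
```

Under these assumptions, the latent factor lies in (0, 1], so it can only shrink the covariance between observations from patients who are far apart in the latent space. Observations from the same patient share a latent vector and keep their full input covariance.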

Abstract

Ecological Momentary Assessment (EMA) provides real-time data on suicidal thoughts and behaviors, but predicting suicide attempts remains challenging due to their rarity and patient heterogeneity. We show that single models fit to all patients perform poorly, while individualized models improve performance but still overfit to patients with limited data. To address this, we introduce Latent Similarity Gaussian Processes (LSGPs) to capture patient heterogeneity, enabling those with little data to leverage similar patients' trends. Preliminary results show promise: even without kernel design, we outperform all but one baseline while offering a new understanding of patient similarity.

Paper Structure

This paper contains 6 sections, 5 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Top row: Idiographic models outperform their Single counterparts. Except for specificity, which stayed constant, the difference in metrics between the idiographic and single models is almost always positive. Bottom row: The 30% of patients with the fewest data points consistently receive worse forecasts across most metrics and most models than the 30% of patients with the most data. Except for sensitivity, the difference in metrics is usually negative, indicating lower performance.
  • Figure 2: Patient heterogeneity is so high that even random groupings of patients significantly boost performance. We split the data set into $G$ groups by randomly assigning patients, fit a separate model to each group, and measure performance (y-axis) as $G$ increases (x-axis). We repeated this experiment 10 times per $G$, plotting the distribution of metrics. Finally, we also compare models fit on randomly grouped vs. demographics-grouped patients (with the same number of groups $G$).
  • Figure 3: Graphs of patient similarity. Node color represents group membership. Edges are black if connecting nodes of different groups; thickness indicates magnitude of covariance. The modularity of all graphs is close to 0, indicating balanced connections within and between groups.
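The modularity statistic mentioned in the Figure 3 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it uses the standard Newman modularity for a weighted undirected graph, assumes non-negative edge weights and no self-loops, and the function name `weighted_modularity` and the toy graphs are hypothetical. The source does not say how the similarity graph is built from the covariance or which modularity variant is used.

```python
import numpy as np

def weighted_modularity(W, labels):
    """Newman modularity Q for a symmetric, non-negative weight matrix W.

    Q = (1 / 2m) * sum_ij [W_ij - k_i * k_j / (2m)] * 1[c_i == c_j],
    where k_i is the weighted degree of node i and 2m is the sum of W.
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    degree = W.sum(axis=1)
    total = W.sum()  # equals 2m for an undirected graph
    if total == 0:
        return 0.0
    same_group = labels[:, None] == labels[None, :]
    B = W - np.outer(degree, degree) / total  # modularity matrix
    return float((B * same_group).sum() / total)

# Toy graph: two disjoint triangles (nodes 0-2 and nodes 3-5).
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0

q_aligned = weighted_modularity(W, [0, 0, 0, 1, 1, 1])  # groups match the triangles
q_mixed = weighted_modularity(W, [0, 1, 0, 1, 0, 1])    # groups cut across the triangles
```

In this toy case the grouping that matches the graph structure scores 0.5, while the grouping that cuts across it scores slightly below 0. A value near 0, as reported in the caption, means the grouping explains the edge structure about as well as a random assignment would.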