Improving Opioid Use Disorder Risk Modelling through Behavioral and Genetic Feature Integration

Sybille Légitime, Kaustubh Prabhu, Devin McConnell, Bing Wang, Dipak K. Dey, Derek Aguiar

TL;DR

Opioid use disorder risk estimation is challenged by the absence of joint mobility and genetic datasets. The authors present an integrative framework that merges mobility features extracted from GPS and Wi‑Fi traces with genetic variants, using data augmentation and simulation to generate hybrid samples under specified disease co-occurrence and relative risk parameters. Across multiple classifiers, the study shows that combining mobility and genetic features yields better risk prediction performance than either modality alone, with mobility features often driving predictions and genetic signals contributing notably in linear models. They discuss privacy, bias, and generalization considerations, and provide data and code to support future clinical evaluation and deployment.

Abstract

Opioids are effective analgesics for acute and chronic pain, but they also carry a considerable risk of addiction, leading to millions of opioid use disorder (OUD) cases and tens of thousands of premature deaths in the United States each year. Estimating OUD risk prior to prescription could improve the efficacy of treatment regimens, monitoring programs, and intervention strategies, but risk estimation is typically based on self-reported data or questionnaires. We develop an experimental design and computational methods that combine genetic variants associated with OUD with behavioral features extracted from GPS and Wi-Fi spatiotemporal coordinates to assess OUD risk. Since OUD mobility and genetic data do not exist for the same cohort, we develop algorithms to (1) generate mobility features from empirical distributions and (2) synthesize mobility and genetic samples assuming an expected level of disease co-occurrence. We show that integrating genetic and mobility modalities improves risk modelling as measured by classification accuracy, area under the precision-recall and receiver operating characteristic curves, and $F_1$ score. Interpreting the fitted models suggests that mobility features have more influence on OUD risk, although the genetic contribution was significant, particularly in linear models. Although concerns regarding privacy, security, bias, and generalizability must be evaluated in clinical trials before this approach is implemented in practice, our framework provides preliminary evidence that behavioral and genetic features may improve OUD risk estimation to assist with personalized clinical decision-making.

Paper Structure

This paper contains 2 sections, 1 equation, 8 figures, 7 tables, 2 algorithms.

Figures (8)

  • Figure 1: Overview of our integrative approach for estimating disease risk. The mobility trace and genetic data are preprocessed, then augmented to balance the genetic and mobility trace sample sizes. The augmented data is merged using a disease co-occurrence parameter ($C$), genetic relative risk ($G$), and mobility relative risk ($M$). In the modelling step, features and models are selected, and classifiers are trained to estimate OUD risk.
  • Figure 2: Mobility trace features simulation. (1) From a pre-selected population (case or control), the eCDF for a feature is generated. (2) The eCDF for that feature is inverted, and linear interpolation is performed between points. (3) The simulated feature value is created by random sampling of the inverse eCDF. The process is repeated across all features and all samples in the population.
  • Figure 3: AUPRC across co-occurrence levels. Box plots show the distribution of AUPRC with Tukey whiskers (median $\pm 1.5 \times$ interquartile range). Each method was run on $100$ datasets across co-occurrence levels $C \in \{0.6, 1.0\}$ with relative risks $(G,M)=(10,5)$.
  • Figure 4: ROC curves across relative risk configurations - random forest. Random forest models were trained on $100$ randomly sampled datasets with co-occurrence $C=0.8$ and across relative risks $(G,M) \in \{(\infty, \infty), (10,1), (10,5), (15,1), (15,5), (5,5)\}$.
  • Figure 5: ROC curves across relative risk configurations - logistic regression. Logistic regression models were trained on $100$ randomly sampled datasets across relative risks $(G,M) \in \{(\infty, \infty), (10,1), (10,5), (15,1), (15,5), (5,5)\}$.
  • ...and 3 more figures
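
The mobility feature simulation described in the Figure 2 caption (build an eCDF per feature, invert it with linear interpolation, then sample the inverse at random) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the feature names in the usage note, the placement of the sorted observations at evenly spaced probabilities from 0 to 1, and the choice to sample each feature independently are all assumptions made for illustration.

```python
import random
from bisect import bisect_right


def make_inverse_ecdf(observed):
    """Build an interpolated inverse eCDF for one feature.

    Returns a function mapping a probability u in [0, 1] to a feature value.
    """
    xs = sorted(observed)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observed value")
    if n == 1:
        return lambda u: xs[0]

    # Assumption: the i-th sorted value sits at probability i / (n - 1),
    # so u = 0 maps to the minimum and u = 1 to the maximum observation.
    probs = [i / (n - 1) for i in range(n)]

    def inverse(u):
        if not 0.0 <= u <= 1.0:
            raise ValueError("u must lie in [0, 1]")
        j = bisect_right(probs, u) - 1  # index of the knot at or below u
        if j >= n - 1:
            return xs[-1]
        # Linear interpolation between the two neighbouring knots.
        t = (u - probs[j]) / (probs[j + 1] - probs[j])
        return xs[j] + t * (xs[j + 1] - xs[j])

    return inverse


def simulate_features(population, n_samples, seed=None):
    """Simulate samples for one population (case or control).

    population: dict mapping a feature name to its observed values.
    Assumption: features are sampled independently of one another.
    """
    rng = random.Random(seed)
    inverses = {name: make_inverse_ecdf(vals) for name, vals in population.items()}
    return [
        {name: inv(rng.random()) for name, inv in inverses.items()}
        for _ in range(n_samples)
    ]
```

As a usage example with hypothetical feature names, `simulate_features({"distance_km": [1.2, 3.4, 0.8], "places_visited": [2, 5, 3]}, n_samples=10, seed=0)` returns ten simulated samples whose values stay within the observed range of each feature. Sampling each feature independently ignores correlations between features; whether the original method preserves them is not stated in the caption.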