BiPETE: A Bi-Positional Embedding Transformer Encoder for Risk Assessment of Alcohol and Substance Use Disorder with Electronic Health Records

Daniel S. Lee, Mayra S. Haedo-Cruz, Chen Jiang, Oshin Miranda, LiRong Wang

TL;DR

BiPETE introduces a Bi-Positional Embedding Transformer Encoder that fuses relative (RoPE) and absolute (SPE) visit-time encodings to model irregular, longitudinal EHR data for single-disease risk prediction without large-scale pretraining. Trained on All of Us depressive disorder and PTSD cohorts, BiPETE achieves strong ASUD risk discrimination (AUROC ≈ $0.965$ and AUPRC ≈ $0.93$–$0.94$) and outperforms BiGRU, LR, and BNB baselines, with ablations showing the superiority of the dual-embedding approach. Integrated Gradients attribution reveals clinically meaningful biomarkers and treatments linked to higher or lower ASUD risk, offering actionable insights for early intervention. The method demonstrates robust performance with moderate data requirements and provides interpretable cues that align with known immune, hepatic, and neurochemical pathways in mood and trauma-related disorders. Overall, BiPETE presents a practical, interpretable framework for EHR-based disease risk prediction that can operate without pretrained representations and can guide risk mitigation strategies in real-world clinical settings.

Abstract

Transformer-based deep learning models have shown promise for disease risk prediction using electronic health records (EHRs), but modeling temporal dependencies remains a key challenge due to irregular visit intervals and the lack of uniform structure. We propose a Bi-Positional Embedding Transformer Encoder (BiPETE) for single-disease prediction, which integrates rotary positional embeddings to encode relative visit timing and sinusoidal embeddings to preserve visit order. Without relying on large-scale pretraining, BiPETE is trained on EHR data from two mental health cohorts, depressive disorder and post-traumatic stress disorder (PTSD), to predict the risk of alcohol and substance use disorders (ASUD). BiPETE outperforms baseline models, improving the area under the precision-recall curve (AUPRC) by 34% and 50% in the depression and PTSD cohorts, respectively. An ablation study further confirms the effectiveness of the dual positional encoding strategy. We apply the Integrated Gradients method to interpret model predictions, identifying key clinical features associated with ASUD risk and protection, such as abnormal inflammatory, hematologic, and metabolic markers, as well as specific medications and comorbidities. The clinical features identified by the attribution method contribute to a deeper understanding of the risk assessment process and offer clues for mitigating potential risks. In summary, our study presents a practical and interpretable framework for disease risk prediction using EHR data that achieves strong performance.
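The dual positional encoding described in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that the sinusoidal embedding is indexed by visit order and added to the token embedding, and that the rotary embedding rotates query and key vectors by an angle proportional to a days-based time index. The function names, the base of 10000, and the 4-dimensional toy vectors are assumptions and are not taken from the paper.

```python
import math

def sinusoidal_embedding(position, dim, base=10000.0):
    """Absolute sinusoidal encoding of one position (assumed: the visit index)."""
    emb = []
    for i in range(0, dim, 2):  # dim is assumed to be even
        angle = position / (base ** (i / dim))
        emb.extend([math.sin(angle), math.cos(angle)])
    return emb

def rotary(vec, position, base=10000.0):
    """Rotary encoding: rotate each consecutive pair of vec by position * frequency."""
    dim = len(vec)  # assumed to be even
    out = []
    for i in range(0, dim, 2):
        theta = position / (base ** (i / dim))
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Absolute part: add the visit-order encoding to a (hypothetical) token embedding.
dim = 4
token = [0.5, -1.0, 0.25, 2.0]
visit_index = 3
x = [t + p for t, p in zip(token, sinusoidal_embedding(visit_index, dim))]

# Relative part: the attention score depends only on the gap between time indices.
q = [1.0, 0.0, 0.5, -0.5]
k = [0.2, 0.7, -1.0, 0.3]
score_a = dot(rotary(q, 10), rotary(k, 30))  # 20 days apart
score_b = dot(rotary(q, 50), rotary(k, 70))  # also 20 days apart
print(abs(score_a - score_b) < 1e-9)
```

The two scores agree because rotating the query and key by their own positions leaves a dot product that depends only on the position difference, which is what makes the rotary component a relative encoding, while the added sinusoidal vector carries the absolute visit order.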

Paper Structure

This paper contains 20 sections, 3 equations, 4 figures, 10 tables.

Figures (4)

  • Figure 1: Pipeline Flowchart Illustrating Data Preprocessing, Model Input Construction, and BiPETE Architecture. MHD Dx denotes the diagnosis of a mental health disorder. For both the ASUD and non-ASUD classes, only the EHR codes from visits occurring after the MHD diagnosis are extracted. Repeated values within the Visit and Days-ago index sequences indicate that the corresponding EHR codes were recorded during the same clinical visit. In the BiPETE architecture, K, Q, and V are the key, query, and value embeddings, respectively, and M is the number of encoder blocks.
  • Figure 2: Training and Validation Performance of Classifiers with Different Positional Encoding Configurations. Training and validation loss, accuracy and AUROC are reported across 30 training epochs to compare model learning and generalization. The error bars indicate the standard deviation across five-fold cross-validation folds.
  • Supplementary Figure 1: Test ROC and PR Curve Comparison of Classifiers with Different Positional Encoding Configurations. For receiver operating characteristic (ROC) curves, the x-axis shows the false positive rate, and the y-axis shows the true positive rate. For precision-recall curves, the x-axis shows recall, and the y-axis shows precision. Curves and metrics are computed using the mean values of the corresponding rates across the cross-validation folds.
  • Supplementary Figure 2: Attention Heads of Final Layer in BiPETE. The attention is calculated using a representative EHR code sequence of length 20. EHR codes in axes labels are replaced with the vocabulary type to preserve patient anonymity. Attention maps show that tokens tend to receive higher or lower scores in groups of tokens from the same visit. Different heads focus on distinct visit-level token groups, seeking different patterns in the sequence.
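
The Figure 1 caption notes that repeated values in the Visit and Days-ago index sequences mark codes recorded during the same visit. A minimal sketch of how such sequences might be built is shown below. It assumes that one calendar date corresponds to one visit and that days-ago is counted back from a chosen reference date; the paper summary does not specify either choice, and the code strings are hypothetical placeholders.

```python
from datetime import date

def build_index_sequences(records, reference_date):
    """Turn (visit_date, ehr_code) pairs into codes, visit indices, and days-ago values."""
    ordered = sorted(records, key=lambda r: r[0])  # stable sort: chronological order
    codes, visit_idx, days_ago = [], [], []
    visit_number = {}  # visit_date -> 1-based visit index (assumes one visit per date)
    for visit_date, code in ordered:
        if visit_date not in visit_number:
            visit_number[visit_date] = len(visit_number) + 1
        codes.append(code)
        visit_idx.append(visit_number[visit_date])
        days_ago.append((reference_date - visit_date).days)
    return codes, visit_idx, days_ago

# Hypothetical patient history: two codes share the first visit.
records = [
    (date(2020, 1, 1), "DX_A"),
    (date(2020, 1, 1), "RX_B"),
    (date(2020, 1, 15), "LAB_C"),
    (date(2020, 3, 1), "DX_D"),
]
codes, visit_idx, days_ago = build_index_sequences(records, date(2020, 3, 1))
print(visit_idx)  # [1, 1, 2, 3]
print(days_ago)   # [60, 60, 46, 0]
```

The first two codes share both a visit index and a days-ago value, which matches the repeated-value pattern the caption describes.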