HealthGAT: Node Classifications in Electronic Health Records using Graph Attention Networks

Fahmida Liza Piya, Mehak Gupta, Rahmatollah Beheshti

TL;DR

HealthGAT addresses the challenge of deriving meaningful representations from electronic health records by introducing a hierarchical graph neural network that learns service embeddings and refines visit embeddings through a graph attention network. It leverages two auxiliary pre-training tasks to predict current and future medical codes, enhancing temporal and predictive fidelity. Evaluated on the eICU dataset, HealthGAT outperforms baselines in node classification and readmission prediction, demonstrating its ability to model complex medical relationships and progression. The approach offers a scalable, interpretable framework for improving clinical decision support and EHR-based analytics.

Abstract

While electronic health records (EHRs) are widely used across various applications in healthcare, most applications use EHRs in their raw (tabular) format. Relying on raw data or simple pre-processing can greatly limit the performance, or even the applicability, of downstream tasks that use EHRs. To address this challenge, we present HealthGAT, a novel graph attention network framework that uses a hierarchical approach to generate embeddings from EHRs, surpassing traditional graph-based methods. Our model iteratively refines the embeddings for medical codes, resulting in improved EHR data analysis. We also introduce customized EHR-centric auxiliary pre-training tasks to leverage the rich medical knowledge embedded within the data. This approach provides a comprehensive analysis of complex medical relationships and offers a significant advancement over standard data representation techniques. HealthGAT has demonstrated its effectiveness in various healthcare scenarios through comprehensive evaluations against established methodologies. Specifically, our model shows outstanding performance in node classification and in downstream tasks such as readmission prediction and diagnosis classification. Our code is available at https://github.com/healthylaife/HealthGAT

Paper Structure (22 sections, 5 equations, 3 figures, 6 tables)

Figures (3)

  • Figure 1: Relations between different medical entities. The visual depicts three distinct entities representing doctors, patients, and medical services, including diagnoses and procedures. Doctors provide expertise and diagnoses, patients receive care, and medical services encompass various procedures and treatments. The interconnectedness of these elements illustrates the complexity and richness of EHR data, providing valuable insights for healthcare analysis and decision-making.
  • Figure 2: Service embeddings visualized in two dimensions, with each dot representing a service. The visualization depicts the co-occurrence patterns, interaction patterns, and distribution of services. Services that frequently occur together or share similar contexts appear closer in the embedding space, revealing underlying structures and relationships in healthcare service provision.
  • Figure 3: Step-by-step creation and refinement of visit embeddings, highlighting the key phases involved in capturing the complex temporal dynamics of patient visits. T marks the start of the time intervals and corresponds to the onset of a patient stay. Subsequent intervals, denoted T+1440, T+2880, and so forth, are spaced 1440 minutes (24 hours) apart. The parameter n is the number of intervals that have elapsed since the starting point T, so T+n×1440 denotes the time point that is n intervals (each spanning 1440 minutes) after T.
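
The interval arithmetic in the Figure 3 caption can be illustrated with a minimal Python sketch. It is not taken from the HealthGAT code base: the function names, the half-open interval convention, and the example event codes are all assumptions made for illustration. Only the 1440-minute interval width and the T + n × 1440 boundaries come from the caption.

```python
MINUTES_PER_INTERVAL = 1440  # 24 hours, the spacing stated in the Figure 3 caption


def interval_index(offset_minutes, start=0, width=MINUTES_PER_INTERVAL):
    """Return n such that start + n*width <= offset_minutes < start + (n+1)*width.

    Assumption: intervals are half-open on the right; the caption does not
    say how boundary events are assigned.
    """
    if offset_minutes < start:
        raise ValueError("event occurs before the start of the stay")
    return (offset_minutes - start) // width


def interval_start(n, start=0, width=MINUTES_PER_INTERVAL):
    """Return the time point T + n*width, with T given by `start`."""
    return start + n * width


def group_by_interval(events, start=0, width=MINUTES_PER_INTERVAL):
    """Group (offset_minutes, code) pairs by the interval they fall into."""
    buckets = {}
    for offset, code in events:
        n = interval_index(offset, start, width)
        buckets.setdefault(n, []).append(code)
    return buckets


# Hypothetical events: minutes since admission paired with made-up codes.
example_events = [(10, "dx_a"), (1500, "proc_b"), (20, "dx_c"), (2880, "dx_d")]
grouped = group_by_interval(example_events)
```

With T = 0, an event 1500 minutes into the stay lands in interval n = 1, which begins at T + 1440. Grouping codes this way yields one bucket per 24-hour segment, each of which could then feed a per-interval visit representation.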