Integrating Social Determinants of Health into Knowledge Graphs: Evaluating Prediction Bias and Fairness in Healthcare

Tianqi Shang, Weiqing He, Tianlong Chen, Ying Ding, Huanmei Wu, Kaixiong Zhou, Li Shen

TL;DR

This paper tackles integrating Social Determinants of Health (SDoH) into biomedical knowledge graphs and assesses prediction bias in a drug–disease link task. It constructs an SDoH-enriched KG from MIMIC-III, MIMIC-SBDH, and PrimeKG, and introduces a fairness formulation for graph embeddings that enforces invariance to sensitive SDoH attributes. A heterogeneous-GCN is trained for link prediction, biases with respect to different SDoH are detected, and a post-processing edge-reweighting scheme is proposed to mitigate SDoH-related bias while preserving predictive performance. The results demonstrate substantial bias reduction across multiple SDoH categories with only negligible changes in Mean Reciprocal Rank, underscoring the practicality of balancing fairness and accuracy in healthcare recommendations.

Abstract

Social determinants of health (SDoH) play a crucial role in patient health outcomes, yet their integration into biomedical knowledge graphs remains underexplored. This study addresses this gap by constructing an SDoH-enriched knowledge graph using the MIMIC-III dataset and PrimeKG. We introduce a novel fairness formulation for graph embeddings, focusing on invariance with respect to sensitive SDoH information. By employing a heterogeneous-GCN model for drug-disease link prediction, we detect biases related to various SDoH factors. To mitigate these biases, we propose a post-processing method that strategically reweights edges connected to SDoH nodes, balancing their influence on graph representations. This approach represents one of the first comprehensive investigations into fairness issues within biomedical knowledge graphs incorporating SDoH. Our work not only highlights the importance of considering SDoH in medical informatics but also provides a concrete method for reducing SDoH-related biases in link prediction tasks, paving the way for more equitable healthcare recommendations. Our code is available at https://github.com/hwq0726/SDoH-KG.

Paper Structure

This paper contains 4 sections, 7 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: The workflow of this study. We first construct the SDoH knowledge graph by integrating MIMIC-III, MIMIC-SBDH, and PrimeKG. Then we define the fairness notion and detect the bias with respect to different sensitive SDoH. Finally, we propose a re-weighting strategy to mitigate bias by learning a weighting parameter on the edges between SDoH and drugs.
  • Figure 2: Architecture of our knowledge graph and the number of each type of node.
  • Figure 3: The workflow for bias detection. This figure illustrates the process of detecting bias with respect to a sensitive SDoH $\mathcal{T}_i = \{w_0, w_1\}$. In the figure, drug node B is a $\mathcal{T}_i$-free drug node, and we mask its edge with disease $\Omega$ to build the training graph. We then connect B to SDoH $w_0$ and $w_1$ in turn to construct two testing graphs, $\mathcal{G}_0$ and $\mathcal{G}_1$. A GCN trained on the training graph is used to perform inference on both testing graphs, and the resulting scores are compared to detect bias.
  • Figure 4: The training process of the de-biasing method. We first follow the steps in Bias Detection but collect a different test edge set to generate two training graphs for de-biasing: $\mathcal{G}_0^d, \mathcal{G}_1^d$. Then, with the weighted-GCN, we perform inference on both graphs and compute the loss. During the update phase, all parameters of the pre-trained GCN are frozen and only the weighting parameter $\hat{\mathbf{e}}$ is updated to minimize the loss.
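
The bias-detection and de-biasing workflows in Figures 3 and 4 can be illustrated with a small toy sketch. This is not the authors' heterogeneous-GCN or their loss function: it assumes a single simplified message-passing step, a dot-product link score, hand-picked 2-d embeddings, and a squared score gap as the training objective. All names (`drug_embedding`, `score_gap`, `learn_edge_weights`) are hypothetical. It only shows the general idea of scoring the same drug–disease pair on two graphs that differ in the attached SDoH value, then learning edge weights while every other parameter stays frozen.

```python
"""Toy sketch of score-gap bias detection and edge reweighting (illustrative only)."""


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def drug_embedding(base, sdoh_vec, edge_weight):
    # One simplified message-passing step (an assumption, not the paper's model):
    # the drug keeps its own features and adds the SDoH neighbor's message,
    # scaled by the weight on the SDoH-drug edge.
    return [b + edge_weight * s for b, s in zip(base, sdoh_vec)]


def score_gap(base, disease, w0, w1, e0, e1):
    # Score the same drug-disease pair on the two test graphs G_0 and G_1,
    # which differ only in which SDoH value the drug is connected to.
    s0 = dot(drug_embedding(base, w0, e0), disease)
    s1 = dot(drug_embedding(base, w1, e1), disease)
    return s0 - s1


def learn_edge_weights(base, disease, w0, w1, lr=0.1, steps=200):
    # All embedding vectors stay frozen; only the two edge weights are updated
    # by gradient descent on the squared score gap.
    e0, e1 = 1.0, 1.0
    a0, a1 = dot(w0, disease), dot(w1, disease)
    for _ in range(steps):
        gap = score_gap(base, disease, w0, w1, e0, e1)
        e0 -= lr * 2.0 * gap * a0  # d(gap^2)/d(e0) = 2 * gap * a0
        e1 += lr * 2.0 * gap * a1  # d(gap^2)/d(e1) = -2 * gap * a1
    return e0, e1


# Hypothetical 2-d embeddings chosen by hand for the demonstration.
drug_b = [0.3, 0.7]
disease_omega = [1.0, 0.5]
sdoh_w0 = [0.8, 0.2]
sdoh_w1 = [0.1, 0.4]

gap_before = score_gap(drug_b, disease_omega, sdoh_w0, sdoh_w1, 1.0, 1.0)
e0, e1 = learn_edge_weights(drug_b, disease_omega, sdoh_w0, sdoh_w1)
gap_after = score_gap(drug_b, disease_omega, sdoh_w0, sdoh_w1, e0, e1)

print(f"gap before: {gap_before:.4f}, gap after: {gap_after:.2e}")
print(f"learned edge weights: e0={e0:.4f}, e1={e1:.4f}")
```

In this toy setting the gap shrinks because the weight on the more influential SDoH edge is reduced and the other is increased. The sketch does not measure ranking quality, so it says nothing about the Mean Reciprocal Rank trade-off reported in the paper.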