KG-TREAT: Pre-training for Treatment Effect Estimation by Synergizing Patient Data with Knowledge Graphs

Ruoqi Liu, Lingfei Wu, Ping Zhang

TL;DR

Evaluation on four downstream TEE tasks shows KG-TREAT's superiority over existing methods, with an average improvement of 7% in Area under the ROC Curve (AUC) and 9% in Influence Function-based Precision of Estimating Heterogeneous Effects (IF-PEHE).

Abstract

Treatment effect estimation (TEE) is the task of determining the impact of various treatments on patient outcomes. Current TEE methods fall short because they rely on limited labeled data and struggle with sparse, high-dimensional observational patient data. To address these challenges, we introduce a novel pre-training and fine-tuning framework, KG-TREAT, which synergizes large-scale observational patient data with biomedical knowledge graphs (KGs) to enhance TEE. Unlike previous approaches, KG-TREAT constructs dual-focus KGs and integrates a deep bi-level attention synergy method for in-depth information fusion, enabling distinct encoding of treatment-covariate and outcome-covariate relationships. KG-TREAT also incorporates two pre-training tasks to ensure a thorough grounding and contextualization of patient data and KGs. Evaluation on four downstream TEE tasks shows KG-TREAT's superiority over existing methods, with an average improvement of 7% in Area under the ROC Curve (AUC) and 9% in Influence Function-based Precision of Estimating Heterogeneous Effects (IF-PEHE). The effectiveness of our estimated treatment effects is further affirmed by alignment with established randomized clinical trial findings.

Paper Structure (26 sections, 16 equations, 5 figures, 9 tables, 1 algorithm)

Figures (5)

  • Figure 1: A detailed illustration of KG-TREAT. (a) Dual-focus PKGs are constructed by extracting relevant treatment-covariate and outcome-covariate information for each individual patient from the KG. (b) The model is pre-trained by synergizing patient data with the corresponding PKGs through the proposed deep bi-level attention synergy method. Two pre-training tasks are unified to learn contextualized representations. (c) The pre-trained model is fine-tuned on downstream data for TEE.
  • Figure 2: Visualization of the graph attention weights for (a) the treatment-covariate PKG and (b) the outcome-covariate PKG. The patient is from the "Apixaban vs. Warfarin" dataset. Higher attention weights are shown as thicker, darker edges. The extra 2-hop bridge nodes are shown in gray to distinguish them from the initial set of nodes, shown in yellow.
  • Figure A1: Illustration of the downstream data construction for treatment and control patient groups. The index date refers to the first prescription of either the treated or control medication, which should not precede the disease (CAD) initiation date. The baseline period before the index date is set to be no less than one year and the follow-up period after the index date is also set to one year.
  • Figure A2: The data flow for RCT extraction. The downstream tasks are constructed based on the extracted RCTs.
  • Figure A3: Low-resource fine-tuning performance on the Rivaroxaban vs. Aspirin and Ticagrelor vs. Aspirin datasets, with different fractions (%) of the training set used (x-axes).
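
The cohort rule in the Figure A1 caption can be sketched as a simple eligibility check. This is a minimal illustration only, not the authors' pipeline: the function name `is_eligible`, its parameters, the 365-day approximation of "one year", and the requirement that a patient's records span the whole follow-up window are all assumptions made for this example.

```python
from datetime import date

# Assumption: "one year" is approximated as 365 days.
BASELINE_DAYS = 365
FOLLOW_UP_DAYS = 365


def is_eligible(disease_onset: date, first_record: date,
                index_date: date, last_record: date) -> bool:
    """Check the windows described in the Figure A1 caption (hypothetical helper).

    index_date is the first prescription of the treated or control medication.
    """
    # The index date should not precede the disease (CAD) initiation date.
    if index_date < disease_onset:
        return False
    # The baseline period before the index date is no less than one year.
    if (index_date - first_record).days < BASELINE_DAYS:
        return False
    # Assumption: records must cover the one-year follow-up after the index date.
    if (last_record - index_date).days < FOLLOW_UP_DAYS:
        return False
    return True
```

For example, a patient whose first record is two years before the index date and whose last record is more than a year after it passes the check, while a patient with only a few months of baseline history does not.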