ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference

Krzysztof Kacprzyk, Samuel Holt, Jeroen Berrevoets, Zhaozhi Qian, Mihaela van der Schaar

TL;DR

The authors address the problem of inferring longitudinal heterogeneous treatment effects with interpretable dynamics by reframing treatment effect (TE) inference as an ODE discovery problem. They introduce INSITE, a framework that first learns a population-level ODE for trajectories under different treatments and then personalizes it by fine-tuning subject-specific parameters, enabling individualized dynamic treatment effects. The approach yields interpretable, closed-form equations that accommodate irregular sampling and diverse treatment types, and it relies on a set of identification assumptions different from those of neural-network-based TE methods. On synthetic pharmacokinetic-pharmacodynamic (PKPD) benchmarks, INSITE maintains a low counterfactual prediction error across long time horizons and under increasing time-dependent confounding, highlighting the value of combining equation discovery with causal inference. This work opens avenues for more interpretable, data-efficient, and transferable treatment-effect models with potential impact in precision medicine and beyond.
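The two-stage idea described above (fit one population-level equation, then fine-tune its constants per subject) can be sketched in a few lines. This is only an illustrative sketch under strong assumptions, not the authors' implementation: the equation form `dx/dt = -k*x + dose` is taken as already discovered, the data are hypothetical and noiseless, and the helper names (`trajectory`, `sse`, `golden_section`, `treatment_effect`) and the search bracket around the population value are invented for this example. Because the equation has a closed-form solution, it can be evaluated at irregular time points without any resampling.

```python
import math

def trajectory(k, x0, dose, times):
    """Closed-form solution of dx/dt = -k*x + dose at arbitrary time points."""
    return [x0 * math.exp(-k * t) + dose * (1.0 - math.exp(-k * t)) / k
            for t in times]

def sse(k, subjects):
    """Sum of squared errors between the ODE solution and the observations."""
    total = 0.0
    for s in subjects:
        pred = trajectory(k, s["x0"], s["dose"], s["times"])
        total += sum((p - y) ** 2 for p, y in zip(pred, s["obs"]))
    return total

def golden_section(f, lo, hi, tol=1e-6):
    """Derivative-free 1-D minimisation on the bracket [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Hypothetical cohort: each subject has its own rate k, its own constant dose,
# and its own irregular observation times.
true_k = [0.4, 0.5, 0.9]
designs = [
    (2.0, 0.0, [0.3, 1.1, 2.0, 4.5]),
    (1.0, 1.0, [0.5, 0.9, 3.2]),
    (3.0, 0.5, [0.2, 1.7, 2.4, 3.9, 6.0]),
]
subjects = [
    {"x0": x0, "dose": dose, "times": times,
     "obs": trajectory(k, x0, dose, times)}
    for k, (x0, dose, times) in zip(true_k, designs)
]

# Stage 1: a single population-level constant shared by all subjects.
k_pop = golden_section(lambda k: sse(k, subjects), 0.01, 5.0)

# Stage 2: keep the equation form fixed and refit the constant per subject,
# searching in an (assumed) bracket around the population value.
k_ind = [
    golden_section(lambda k, s=s: sse(k, [s]), k_pop / 4.0, k_pop * 4.0)
    for s in subjects
]

def treatment_effect(k, x0, dose_a, dose_b, t):
    """Difference in predicted outcome at time t between two constant doses."""
    return trajectory(k, x0, dose_a, [t])[0] - trajectory(k, x0, dose_b, [t])[0]

# Individualized effect for the first subject: dose 1.0 versus no treatment.
effect = treatment_effect(k_ind[0], 2.0, 1.0, 0.0, 2.0)
print(k_pop, k_ind, effect)
```

In this toy setting the per-subject constants absorb the between-subject variability, while the shared equation form stays readable. A real pipeline would replace the assumed equation with the output of an ODE discovery method and would have to handle noise and confounding, which this sketch ignores.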

Abstract

Inferring unbiased treatment effects has received widespread attention in the machine learning community. In recent years, our community has proposed numerous solutions in standard settings, high-dimensional treatment settings, and even longitudinal settings. While very diverse, these solutions have mostly relied on neural networks for inference and simultaneous correction of assignment bias. New approaches typically build on top of previous approaches by proposing new (or refined) architectures and learning algorithms. However, the end result -- a neural-network-based inference machine -- remains unchallenged. In this paper, we introduce a different type of solution in the longitudinal setting: a closed-form ordinary differential equation (ODE). While we still rely on continuous optimization to learn an ODE, the resulting inference machine is no longer a neural network. Doing so yields several advantages such as interpretability, support for irregular sampling, and a different set of identification assumptions. Above all, we consider the introduction of a completely new type of solution to be our most important contribution as it may spark entirely new innovations in treatment effects in general. We facilitate this by formulating our contribution as a framework that can transform any ODE discovery method into a treatment effects method.

Paper Structure (39 sections, 16 equations, 5 figures, 31 tables)


Figures (5)

  • Figure 1: Conceptual overview. Left: A longitudinal treatment effects dataset with biased samples. Middle: A standard treatment effects (TE) approach learns a representation of the data, $\mathcal{D}$, and uses that representation for inference. Right: Our approach, which learns an ODE that is refined for each specific patient in the dataset.
  • Figure 2: Dimensions of our framework. The x-axis shows different treatment types (cf. \ref{sec:connect:2}) and the y-axis shows between-subject variability (BSV) in increasing difficulty (cf. \ref{sec:connect:3}). ✓ indicates "no adapting needed", and shows our framework. Green shows what ODE discovery methods can do out of the box. Blue shows INSITE's possibilities, encompassing all settings, including complex BSV.
  • Figure 3: (a) Counterfactual $\tau$-step ahead prediction error ($\gamma=2$), on \ref{eq:one-compartment-pkpd}.D from \ref{table:main_table_results}. (b) Counterfactual $6$-step ahead prediction error (for increasing time-dependent confounding, $\gamma$), on the standard Cancer PKPD dataset. INSITE maintains a low normalized RMSE (high performance) across long time horizons and increasing time-dependent confounding. Further results are in \ref{app:experiments}.
  • Figure 4: Kernel density estimate plot of the individualized numeric constant $\beta_0$ for each patient, for a further synthetic equation that extends \ref{eq:one-compartment-pkpd} with an offset term $\beta_0$ sampled from a bimodal Gaussian mixture distribution. We observe that the distribution of the recovered constants $\beta_0$ across patients is bimodal and that its two modes coincide with the two modes of the underlying mixture distribution. This verifies that INSITE can recover a probabilistic interpretation of the underlying data-generating process, with a population differential equation in which each numeric constant can be represented by a distribution.
  • Figure : INSITE Ablation