Domain Generalization and Adaptation in Intensive Care with Anchor Regression

Malte Londschien, Manuel Burger, Gunnar Rätsch, Peter Bühlmann

TL;DR

A large-scale study of causality-inspired domain generalization on heterogeneous multi-center intensive care unit (ICU) data finds that anchor regularization improves out-of-distribution performance, particularly for the most dissimilar target domains.

Abstract

The performance of predictive models in clinical settings often degrades when they are deployed in new hospitals, due to distribution shifts. This paper presents a large-scale study of causality-inspired domain generalization on heterogeneous multi-center intensive care unit (ICU) data. We apply anchor regression and introduce anchor boosting, a novel tree-based nonlinear extension, to a large dataset comprising 400,000 patients from nine distinct ICU databases. We find that anchor regularization improves out-of-distribution performance, particularly for the most dissimilar target domains. The methods appear robust to violations of theoretical assumptions, such as anchor exogeneity. Furthermore, we propose a novel conceptual framework to quantify the utility of large external datasets. By evaluating performance as a function of available target-domain data, we identify three regimes: (i) a domain generalization regime, where only the external model should be used, (ii) a domain adaptation regime, where refitting the external model is optimal, and (iii) a data-rich regime, where external data provides no additional value.
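
The abstract names anchor regression without stating its objective. As a minimal sketch, assuming the standard linear formulation from the anchor regression literature, $\min_\beta \|(I - P_A)(y - X\beta)\|_2^2 + \gamma \|P_A (y - X\beta)\|_2^2$ with $P_A$ the projection onto the column space of the anchors $A$, the estimator can be computed as ordinary least squares on transformed data. The function name, the synthetic data, and the use of a one-hot environment indicator as anchor are illustrative assumptions, not the authors' implementation, and the sketch omits the elastic-net penalty and the boosting extension.

```python
import numpy as np


def anchor_regression(X, y, A, gamma):
    """Linear anchor regression (illustrative sketch, not the paper's code).

    Minimizes ||(I - P_A)(y - X b)||^2 + gamma * ||P_A (y - X b)||^2,
    where P_A projects onto the column space of the centered anchors A.
    gamma = 1 recovers ordinary least squares.
    """
    # Center everything so that no intercept term is needed.
    X = X - X.mean(axis=0)
    y = y - y.mean()
    A = A - A.mean(axis=0)

    def transform(M):
        # Apply W = I + (sqrt(gamma) - 1) P_A without forming the n x n matrix:
        # P_A M is the least-squares fit of M on A.
        coef, *_ = np.linalg.lstsq(A, M, rcond=None)
        return M + (np.sqrt(gamma) - 1.0) * (A @ coef)

    # OLS on the transformed data solves the anchor regression objective.
    beta, *_ = np.linalg.lstsq(transform(X), transform(y), rcond=None)
    return beta


# Synthetic example (hypothetical data): three environments shift a hidden
# confounder H that affects both the first feature and the outcome.
rng = np.random.default_rng(0)
n = 600
env = rng.integers(0, 3, size=n)
A = np.eye(3)[env]                      # one-hot environment ID as anchor
H = rng.normal(size=n) + A @ np.array([-1.0, 0.0, 2.0])
X = np.column_stack([H + rng.normal(size=n), rng.normal(size=n)])
y = X[:, 0] + 0.5 * H + rng.normal(size=n)

for gamma in (1.0, 10.0, 100.0):
    print(gamma, anchor_regression(X, y, A, gamma))
```

Larger values of $\gamma$ penalize the part of the residual that the anchors can explain more heavily, which is the mechanism intended to trade in-distribution fit for robustness to shifts across environments.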

Paper Structure

This paper contains 38 sections, 11 equations, 28 figures, 2 tables.

Figures (28)

  • Figure 1: Distributions of binary and continuous outcomes.
  • Figure 2: MSE predicting log(creatinine) in 24 hours as a function of available patients from the target domains eICU (left) and PICdb (right).
  • Figure 3: Anchor boosting's OOD MSE predicting log(creatinine) as a function of $\gamma$, using one-hot-encoded dataset ID as anchor. See \ref{subsec:gamma} for details on the LOEO-CV model selection.
  • Figure 4: Linear anchor regression's OOD MSE predicting log(creatinine) in 24 hours as a function of $\gamma$. We add an elastic-net regularization term $\lambda \left( \eta \|\beta\|_1 + (1 - \eta) \| \beta \|_2^2 \right)$ to \ref{eq:anchor,eq:anchor_general}. Performances are colored by $\lambda = \lambda_\mathrm{max} / 10^2$ (orange), $\lambda_\mathrm{max} / 10^3$ (blue), and $\lambda_\mathrm{max} / 10^4$ (green). Lasso ($\eta=1$) is dashed, elastic net ($\eta = 0.5$) solid, and ridge ($\eta=0$) dotted.
  • Figure 5: Differently expressed OOD performances predicting log(creatinine) in 24 hours as a function of $\gamma$. The performance on the y-axis is the number of patients from the target domain required to match nonlinear anchor boosting's (left) and linear anchor regression's (right) OOD performance.
  • ...and 23 more figures