Out-of-distribution robustness for multivariate analysis via causal regularisation

Homer Durand, Gherardo Varando, Nathan Mankovich, Gustau Camps-Valls

TL;DR

The paper addresses out-of-distribution generalisation in multivariate analysis (MVA) by extending Anchor Regression (AR) to a broad class of multivariate algorithms. It shows that, under a linear anchor structural causal model (SCM), the worst-case loss over anchor interventions reduces to a simple linear combination of training covariances, so that anchor-compatible losses achieve distributional robustness. The authors provide population and sample estimators, discuss parameter selection, and identify which MVA algorithms are anchor-compatible (e.g., MLR, OPLS, RRR, PLS) and which are not (e.g., CCA). Through simulations, climate detection and attribution (D&A), and air-quality experiments, they demonstrate that anchor-regularised MVA algorithms improve test performance and invariance under bounded anchor perturbations, while incurring modest computational overhead. The work bridges causal inference and classical MVA, offering practical, robust tools for scientific applications prone to domain shift and paving the way for nonlinear (kernel) extensions.

Abstract

We propose a regularisation strategy of classical machine learning algorithms rooted in causality that ensures robustness against distribution shifts. Building upon the anchor regression framework, we demonstrate how incorporating a straightforward regularisation term into the loss function of classical multivariate analysis algorithms, such as (orthonormalized) partial least squares, reduced-rank regression, and multiple linear regression, enables out-of-distribution generalisation. Our framework allows users to efficiently verify the compatibility of a loss function with the regularisation strategy. Estimators for selected algorithms are provided, showcasing consistency and efficacy in synthetic and real-world climate science problems. The empirical validation highlights the versatility of anchor regularisation, emphasizing its compatibility with multivariate analysis approaches and its role in enhancing replicability while guarding against distribution shifts. The extended anchor framework advances causal inference methodologies, addressing the need for reliable out-of-distribution generalisation.
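
As an illustration of the "straightforward regularisation term" described above, the following is a minimal sketch of anchor-regularised multiple linear regression in NumPy. It assumes the standard anchor regression objective, $\|(I - P_A)(Y - XB)\|^2 + \gamma \|P_A(Y - XB)\|^2$, where $P_A$ projects onto the column space of the anchor matrix $A$. The function names (`anchor_transform`, `anchor_mlr`, `anchor_loss`) and the synthetic data are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def anchor_transform(M, A, gamma):
    """Return (I - (1 - sqrt(gamma)) P_A) M, where P_A projects onto col(A)."""
    # P_A M is obtained by least squares, avoiding the n x n projector.
    coef, *_ = np.linalg.lstsq(A, M, rcond=None)
    return M - (1.0 - np.sqrt(gamma)) * (A @ coef)

def anchor_mlr(X, Y, A, gamma):
    """Least squares on anchor-transformed data (hypothetical helper)."""
    X_t = anchor_transform(X, A, gamma)
    Y_t = anchor_transform(Y, A, gamma)
    B, *_ = np.linalg.lstsq(X_t, Y_t, rcond=None)
    return B

def anchor_loss(X, Y, A, B, gamma):
    """||(I - P_A) R||^2 + gamma * ||P_A R||^2 for residuals R = Y - X B."""
    R = Y - X @ B
    coef, *_ = np.linalg.lstsq(A, R, rcond=None)
    P_R = A @ coef
    return float(np.sum((R - P_R) ** 2) + gamma * np.sum(P_R ** 2))

# Synthetic data (illustrative only): anchors A, hidden confounder H.
rng = np.random.default_rng(0)
n = 500
A = rng.normal(size=(n, 2))
H = rng.normal(size=(n, 1))
X = A @ rng.normal(size=(2, 3)) + H + rng.normal(size=(n, 3))
Y = X @ rng.normal(size=(3, 2)) + H + rng.normal(size=(n, 2))
A, X, Y = (M - M.mean(axis=0) for M in (A, X, Y))  # centre all variables

B_ols = anchor_mlr(X, Y, A, gamma=1.0)     # gamma = 1 recovers ordinary least squares
B_anchor = anchor_mlr(X, Y, A, gamma=5.0)  # gamma > 1 penalises anchor-aligned residuals
```

Setting `gamma=1.0` leaves the data unchanged, so the estimator coincides with ordinary least squares; larger values put more weight on the part of the residual explained by the anchors. Under the same assumption, other anchor-compatible algorithms could be fitted on the transformed `X_t`, `Y_t` in place of the least-squares step.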

Paper Structure (53 sections, 4 theorems, 43 equations, 15 figures, 6 tables, 2 algorithms)

Key Result

Theorem 3.2

Let the distribution of $(X, Y, H)$ be entailed by the SCM (Eq:anchor-model) and let $\mathcal{L}(X, Y; \mathbf{\Theta})$ be an anchor-compatible loss function. Then, for any set of parameters $\mathbf{\Theta}$ and any causal regulariser $\gamma \in \mathbb{R}^+$, the worst-case loss over the anchor interventions in $C^\gamma = \{\mathbb{P}_\nu : \Sigma_\nu \preceq \gamma \Sigma_A \}$ reduces to a linear combination of training covariances. Proof can be found in Appendix (thm:AClos
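
The constraint $\Sigma_\nu \preceq \gamma \Sigma_A$ defining $C^\gamma$ is a Loewner (positive semidefinite) ordering: it holds when $\gamma \Sigma_A - \Sigma_\nu$ has no negative eigenvalues. The sketch below checks this numerically. It reads $\Sigma_\nu$ and $\Sigma_A$ as symmetric covariance-type matrices of the intervention and of the training anchors, which is an assumption, as this page does not define them. The function name `in_perturbation_set` is hypothetical.

```python
import numpy as np

def in_perturbation_set(sigma_nu, sigma_a, gamma, tol=1e-10):
    """Return True if gamma * sigma_a - sigma_nu is positive semidefinite."""
    diff = gamma * sigma_a - sigma_nu
    diff = (diff + diff.T) / 2.0  # symmetrise against round-off
    return bool(np.linalg.eigvalsh(diff).min() >= -tol)

sigma_a = np.eye(2)  # illustrative anchor covariance
inside = in_perturbation_set(3.0 * np.eye(2), sigma_a, gamma=5.0)   # within the bound
outside = in_perturbation_set(6.0 * np.eye(2), sigma_a, gamma=5.0)  # exceeds the bound
print(inside, outside)
```

In this reading, a larger $\gamma$ enlarges the set of interventions covered by the guarantee.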

Figures (15)

  • Figure 1: Directed Acyclic Graph (DAG) analyzed in this work, as induced by the Structural Causal Model (SCM) described in \ref{Eq:anchor-model}. The directions of the arrows between $X$, $Y$, and $H$ are flexible, provided that the graph remains acyclic. All possible configurations of the DAG can be seen in Fig. \ref{fig:dag_variations} in the Appendix.
  • Figure 2: Robustness to increasing perturbation strength for PA, unregularised, IV, and anchor-regularised ($\gamma=5$) algorithms (MLR, OPLS, RRR, PLS and CCA). Anchor versions show robustness in terms of $R^2$ for bounded intervention strength. Shaded areas represent two standard errors of the mean over $B=20$ runs of the experiment.
  • Figure 3: (Left) Standard deviation of DIV among training (red) and testing (blue) model members. (Right A) $R^2$ score differences between A-RRRR ($\gamma=5$) and RRRR for test models. (Right B) Differences in residuals-DIV correlation ($r$) between A-RRRR ($\gamma=5$) and RRRR for test models. Red and hatched areas indicate where A-RRRR performs better.
  • Figure 4: All possible DAGs compatible with the anchor framework.
  • Figure 5: Experiments in a high-dimensional setting with $d=p=300$ and $n=200$. Both Multi-output Ridge Regression and Reduced Rank Ridge Regression are optimal for a wide range of perturbation strengths. Shaded areas represent two standard errors of the mean over $B=20$ runs of the experiment.
  • ...and 10 more figures

Theorems & Definitions (11)

  • Definition 3.1: Anchor-compatible loss
  • Theorem 3.2
  • Proposition 4.1
  • Proposition A.1: Multilinear, Reduced Rank and Orthonormalised Partial Least Squares Regression are anchor-compatible.
  • proof
  • Proposition A.2: Partial Least Squares regression is anchor-compatible.
  • proof
  • Example A.3: Canonical Correlation Analysis is not anchor-compatible.
  • proof
  • proof
  • ...and 1 more