Blessings and Curses of Covariate Shifts: Adversarial Learning Dynamics, Directional Convergence, and Equilibria

Tengyuan Liang

TL;DR

This paper analyzes how covariate distribution shifts, when modeled adversarially via Wasserstein perturbations, affect learning in an infinite-dimensional linear setting. It reveals a dichotomy: in regression, adversarial shifts steer the covariates exponentially fast toward an optimal experimental design, enabling rapid subsequent learning of the Bayes predictor $f^\top_{Bayes}$; in classification, shifts drive the covariates toward the hardest design at a subquadratic rate, trapping learning away from the Bayes predictor. The results are derived through a sequential game framework and a detailed analysis of Wasserstein gradient flows, supported by numerical illustrations. The findings have implications for robust design and experimental planning under covariate shifts, and point to future directions in iterative dynamics and nonlinear models.

Abstract

Covariate distribution shifts and adversarial perturbations present robustness challenges to the conventional statistical learning framework: mild shifts in the test covariate distribution can significantly affect the performance of a statistical model learned from the training distribution. Model performance typically deteriorates when extrapolation happens: namely, the covariates shift to a region where the training distribution is scarce and, naturally, the learned model has little information. For robustness and regularization, adversarial perturbation techniques have been proposed as a remedy; however, it remains to study carefully which extrapolation region an adversarial covariate shift will focus on, given a learned model. This paper precisely characterizes the extrapolation region, examining both regression and classification in an infinite-dimensional setting. We study the implications of adversarial covariate shifts for subsequent learning of the equilibrium -- the Bayes optimal model -- in a sequential game framework. We exploit the dynamics of the adversarial learning game and reveal the curious effects of the covariate shift on equilibrium learning and experimental design. In particular, we establish two directional convergence results that exhibit distinctive phenomena: (1) a blessing in regression: the adversarial covariate shifts converge at an exponential rate to an optimal experimental design for rapid subsequent learning; (2) a curse in classification: the adversarial covariate shifts converge at a subquadratic rate to the hardest experimental design, trapping subsequent learning.

Paper Structure (27 sections, 8 theorems, 87 equations, 2 figures)

Introduction
Background and Literature Review
Models, covariate distributions, and Bayes optimality
Adversarial perturbation
Wasserstein gradient flow
Problem Setup
Game and equilibria
Best response and information sets
Adversarial dynamics
Main Results
Adversarial Covariate Shifts: Blessings and Curses
Impact on the Learner: Sequential Game Perspective
Numerical Illustration
Experiment setup
Directional convergence: regression
...and 12 more sections

Key Result

Theorem 1

Consider the regression setting where $\ell(y', y) = (y'- y)^2$ and $\mathbf{y}|\mathbf{x} = x \sim \mathrm{Gaussian}(\langle x, \theta^\star\rangle, 1 )$. Let $x_0 \in \mathop{\mathrm{supp}}\nolimits(\mu^{(0)})$ that satisfies $\langle x_0, \theta^\star - \theta^{(0)} \rangle \neq 0$. Then the ind Moreover, the directional convergence is exponential in $T$, where $c = 2\log(1+2\gamma \| \theta^
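The directional convergence in the regression setting can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's construction: it works in a finite dimension, follows a single particle $x_0$, and assumes the adversary takes plain gradient-ascent steps of size $\gamma$ on the population squared loss. Under the Gaussian model above, that loss at $x$ is $\langle x, \theta^{(0)} - \theta^\star\rangle^2 + 1$, so each step rescales the component of $x$ along $\theta^{(0)} - \theta^\star$ by $1 + 2\gamma\|\theta^{(0)} - \theta^\star\|^2$ and leaves the orthogonal part unchanged. The names `adversarial_step`, `alignment`, and `simulate`, and the values of `dim`, `steps`, and `gamma`, are hypothetical choices for illustration.

```python
import math
import random


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def adversarial_step(x, delta, gamma):
    """One gradient-ascent step in x on the population squared loss.

    Assumption: the expected loss at x is <x, delta>^2 + 1 with
    delta = theta_0 - theta_star, so its gradient in x is 2 <x, delta> delta.
    """
    scale = 2.0 * gamma * dot(x, delta)
    return [xi + scale * di for xi, di in zip(x, delta)]


def alignment(x, delta):
    """Absolute cosine of the angle between x and delta."""
    return abs(dot(x, delta)) / math.sqrt(dot(x, x) * dot(delta, delta))


def simulate(dim=20, steps=40, gamma=0.05, seed=0):
    """Track how one particle aligns with theta_0 - theta_star over time."""
    rng = random.Random(seed)
    theta_star = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # hypothetical truth
    theta_0 = [rng.gauss(0.0, 1.0) for _ in range(dim)]     # hypothetical learner
    delta = [a - b for a, b in zip(theta_0, theta_star)]
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]           # starting covariate
    history = [alignment(x, delta)]
    for _ in range(steps):
        x = adversarial_step(x, delta, gamma)
        history.append(alignment(x, delta))
    return history


history = simulate()
```

In this sketch the misalignment $1 - \cos^2$ shrinks geometrically with ratio $(1 + 2\gamma\|\theta^{(0)} - \theta^\star\|^2)^{-2}$ per step, which is one way to read an exponential rate of the form $e^{-cT}$. The starting point must satisfy $\langle x_0, \theta^\star - \theta^{(0)}\rangle \neq 0$, as in the theorem; otherwise the particle never moves.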

Figures (2)

Figure 1: Regression setting, directional convergence. From left to right, top to bottom, we plot the directional information at timestamp $t=0, 5, 10, \ldots, 40$, once every $5$ iterations.
Figure 2: Classification setting, directional convergence. From left to right, top to bottom, we plot the directional information at timestamp $t=0, 25, 50, \ldots, 200$, once every $25$ iterations.

Theorems & Definitions (12)

Theorem 1: Regression: directional convergence
Remark 2
Remark 3
Theorem 4: Classification: directional convergence
Remark 5
Theorem 6: Regression: blessing to the learner
Theorem 7: Classification: curse to the learner
Remark 8
Lemma 9: Nonlinear recursions
Lemma 10
...and 2 more
