Mitigating Covariate Shift in Misspecified Regression with Applications to Reinforcement Learning

Philip Amortila, Tongyi Cao, Akshay Krishnamurthy

TL;DR

This work studies misspecified regression under adversarial covariate shift. It shows that standard empirical risk minimization (ERM) amplifies the misspecification error by the density ratio between the training and testing distributions, which hinders robust performance. It introduces disagreement-based regression (DBR), a robust regression procedure that screens training regions via a disagreement filter and uses a minimax objective to avoid misspecification amplification, achieving $R_{\text{test}}(\text{DBR}) = O(\varepsilon_{\infty}^2)$ asymptotically, with an optimal statistical rate and no dependence of the misspecification term on $C_{\infty}$. The authors extend DBR to offline and online reinforcement learning, obtaining new guarantees under $L_{\infty}$-misspecification and concentrability (offline) or coverability (online), and they demonstrate separations between these coverage notions and Bellman-error-based structural parameters. These results provide a principled, distribution-shift-aware approach to function approximation in RL and reveal trade-offs between misspecification, coverage, and computational tractability. The work also outlines directions for scalable algorithms and for deeper theoretical characterizations of the identified separations.

Abstract

A pervasive phenomenon in machine learning applications is distribution shift, where training and deployment conditions for a machine learning model differ. As distribution shift typically results in a degradation in performance, much attention has been devoted to algorithmic interventions that mitigate these detrimental effects. In this paper, we study the effect of distribution shift in the presence of model misspecification, specifically focusing on $L_{\infty}$-misspecified regression and adversarial covariate shift, where the regression target remains fixed while the covariate distribution changes arbitrarily. We show that empirical risk minimization, or standard least squares regression, can result in undesirable misspecification amplification where the error due to misspecification is amplified by the density ratio between the training and testing distributions. As our main result, we develop a new algorithm -- inspired by robust optimization techniques -- that avoids this undesirable behavior, resulting in no misspecification amplification while still obtaining optimal statistical rates. As applications, we use this regression procedure to obtain new guarantees in offline and online reinforcement learning with misspecification and establish new separations between previously studied structural conditions and notions of coverage.

Paper Structure (33 sections, 19 theorems, 79 equations, 1 figure)

Key Result

proposition 1

For any $\delta \in (0,1)$, with probability at least $1-\delta$, ERM satisfies

Figures (1)

  • Figure 1: The construction used to prove Proposition 2 (the ERM lower bound). $f_{\mathrm{bad}}$ and $\bar{f}$ have equal risk under $\mathcal{D}_{\mathrm{train}}$ but $f_{\mathrm{bad}}$ concentrates errors onto $\mathcal{D}_{\mathrm{test}}$.
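
The idea in the Figure 1 caption can be illustrated with a small population-level calculation. The sketch below is a simplified toy instance, not the paper's exact construction: the finite domain, the zero target, and the names `population_risks`, `C`, and `eps` are all assumptions made for illustration. It builds two predictors with equal squared-error risk under a uniform training distribution, where one of them places all of its error on the single point that the test distribution covers.

```python
import math


def population_risks(C=100, eps=0.05):
    """Toy instance (hypothetical): C points, density ratio test/train equal to C."""
    # Assumed setup: the true regression target is identically zero.
    f_star = [0.0] * C
    # Training distribution: uniform over all C points.
    d_train = [1.0 / C] * C
    # Test distribution: point mass on point 0, so the density ratio is C there.
    d_test = [1.0] + [0.0] * (C - 1)

    # f_bar is uniformly eps-close to the target (the L-infinity approximator).
    f_bar = [eps] * C
    # f_bad is exact everywhere except point 0, where its error is sqrt(C) * eps.
    f_bad = [math.sqrt(C) * eps] + [0.0] * (C - 1)

    def risk(f, d):
        # Squared-error risk of f under the distribution d.
        return sum(p * (fx - tx) ** 2 for p, fx, tx in zip(d, f, f_star))

    return {
        "train_bar": risk(f_bar, d_train),
        "train_bad": risk(f_bad, d_train),
        "test_bar": risk(f_bar, d_test),
        "test_bad": risk(f_bad, d_test),
    }


if __name__ == "__main__":
    for name, value in population_risks().items():
        print(f"{name}: {value:.6f}")
```

In this toy instance both predictors have training risk `eps ** 2`, so a procedure that only compares training risk cannot prefer one over the other, yet the test risk of `f_bad` is `C * eps ** 2` while that of `f_bar` stays at `eps ** 2`.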

Theorems & Definitions (29)

  • proposition 1: upper bound
  • proposition 2: lower bound
  • theorem 1: Main result for DBR
  • corollary 1: Covariate shift for
  • corollary 2: Well-specified case
  • lemma 1: Non-negativity
  • lemma 2: Concentration
  • theorem 2: for offline RL
  • theorem 3: for online RL
  • proposition 2: upper bound
  • ...and 19 more