Stochastic Online Instrumental Variable Regression: Regrets for Endogeneity and Bandit Feedback

Riccardo Della Vecchia, Debabrota Basu

TL;DR

This work tackles endogeneity in stochastic online linear regression and bandits by developing online instrumental-variable methods. The proposed O2SLS estimator enables online two-stage least squares, delivering non-asymptotic guarantees that separate identification and prediction performance while accounting for endogeneity through a second-stage bias term γ. Building on O2SLS, the paper introduces OFUL-IV, a bandit algorithm that achieves regret comparable to exogenous settings in the just-identified case, while remaining robust to endogeneity. Empirical results on synthetic and real data corroborate the theoretical gains, illustrating the practical benefits of online IV regression in driving accurate parameter estimation and low regret under endogeneity.

Abstract

Endogeneity, i.e. the dependence of noise and covariates, is a common phenomenon in real data due to omitted variables, strategic behaviours, measurement errors etc. In contrast, the existing analyses of stochastic online linear regression with unbounded noise and linear bandits depend heavily on exogeneity, i.e. the independence of noise and covariates. Motivated by this gap, we study the over- and just-identified Instrumental Variable (IV) regression, specifically Two-Stage Least Squares, for stochastic online learning, and propose to use an online variant of Two-Stage Least Squares, namely O2SLS. We show that O2SLS achieves $\mathcal O(d_{x}d_{z}\log^2 T)$ identification and $\widetilde{\mathcal O}(\gamma\sqrt{d_{z} T})$ oracle regret after $T$ interactions, where $d_{x}$ and $d_{z}$ are the dimensions of covariates and IVs, and $\gamma$ is the bias due to endogeneity. For $\gamma=0$, i.e. under exogeneity, O2SLS exhibits $\mathcal O(d_{x}^2 \log^2 T)$ oracle regret, which is of the same order as that of the stochastic online ridge. Then, we leverage O2SLS as an oracle to design OFUL-IV, a stochastic linear bandit algorithm to tackle endogeneity. OFUL-IV yields $\widetilde{\mathcal O}(\sqrt{d_{x}d_{z}T})$ regret that matches the regret lower bound under exogeneity. For different datasets with endogeneity, we experimentally show efficiencies of O2SLS and OFUL-IV.
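
To make the idea of an online two-stage least squares estimator concrete, here is a minimal NumPy sketch. It is not the paper's pseudocode: the class name `OnlineTwoStageLS`, the helper `simulate`, the regularisation constant `lam`, and the synthetic data-generating process are all assumptions made for illustration. The sketch keeps running sums of $z_s z_s^\top$, $z_s x_s^\top$ and $z_s y_s$, regresses covariates on IVs (first stage), and then regresses outcomes on the fitted covariates (second stage).

```python
import numpy as np


class OnlineTwoStageLS:
    """Illustrative online 2SLS built on running sufficient statistics.

    Hypothetical helper for this summary, not the authors' implementation.
    """

    def __init__(self, d_x, d_z, lam=1e-3):
        self.G_z = lam * np.eye(d_z)       # regularised IV design matrix
        self.S_zx = np.zeros((d_z, d_x))   # running sum of z_s x_s^T
        self.S_zy = np.zeros(d_z)          # running sum of z_s y_s

    def update(self, z, x, y):
        """Absorb one observation (IV z, covariate x, scalar outcome y)."""
        self.G_z += np.outer(z, z)
        self.S_zx += np.outer(z, x)
        self.S_zy += y * z

    def estimate(self):
        """Return the current second-stage parameter estimate."""
        # First stage: regress covariates on IVs, giving a (d_z, d_x) matrix.
        theta = np.linalg.solve(self.G_z, self.S_zx)
        # Second stage: regress outcomes on the fitted covariates Z @ theta.
        A = theta.T @ self.G_z @ theta
        b = theta.T @ self.S_zy
        # lstsq avoids an exception while A is still rank-deficient.
        return np.linalg.lstsq(A, b, rcond=None)[0]


def simulate(T, rho, seed=0):
    """Synthetic endogenous data (assumed setup): the outcome noise shares
    a component with the first-stage noise, scaled by rho."""
    rng = np.random.default_rng(seed)
    theta = np.array([[1.0, 0.5], [-0.5, 1.0]])  # first-stage parameter
    beta = np.array([1.0, -2.0])                 # second-stage parameter
    Z = rng.normal(size=(T, 2))
    eps = rng.normal(size=(T, 2))
    eta = rho * eps[:, 0] + 0.1 * rng.normal(size=T)  # correlated with X
    X = Z @ theta + eps
    y = X @ beta + eta
    return Z, X, y, beta


if __name__ == "__main__":
    Z, X, y, beta = simulate(T=5000, rho=1.0)
    est = OnlineTwoStageLS(d_x=2, d_z=2)
    for z_t, x_t, y_t in zip(Z, X, y):
        est.update(z_t, x_t, y_t)
    ridge = np.linalg.solve(X.T @ X + 1e-3 * np.eye(2), X.T @ y)
    print("2SLS error :", np.linalg.norm(est.estimate() - beta))
    print("ridge error:", np.linalg.norm(ridge - beta))
```

In this assumed setup the one-stage ridge estimate stays biased however many samples arrive, because the covariates are correlated with the outcome noise, whereas the two-stage estimate uses only the variation in the covariates explained by the IVs.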

Paper Structure

This paper contains 46 sections, 25 theorems, 145 equations, 11 figures, 3 tables, 2 algorithms.

Key Result

Lemma 3.3

Let us define the design matrix of IVs to be $\mathbf{G}_{\boldsymbol z,t} \triangleq \mathbf{Z}_t^{\top} \mathbf{Z}_t + \mathbf{G}_{\boldsymbol z,0} = \sum_{s=1}^t \boldsymbol z_s \boldsymbol z_{s}^\top+ \mathbf{G}_{\boldsymbol z,0}$ with $\mathbf{G}_{\boldsymbol z,0}=\lambda \mathbf{I}_{d_{\boldsymbol z}}$ ... with probability at least $1-\delta \in (0,1)$. Here, $\mathfrak b_{t}(\delta) \triangleq$ ...
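
As a small worked step, the sum form of $\mathbf{G}_{\boldsymbol z,t}$ in the lemma implies a rank-one recursion, so the design matrix can be maintained online without storing past IVs. The inverse update below is the standard Sherman-Morrison identity, added here for illustration; it is not something the excerpt itself states.

$$
\mathbf{G}_{\boldsymbol z,t} = \mathbf{G}_{\boldsymbol z,t-1} + \boldsymbol z_t \boldsymbol z_t^\top,
\qquad
\mathbf{G}_{\boldsymbol z,t}^{-1} = \mathbf{G}_{\boldsymbol z,t-1}^{-1} - \frac{\mathbf{G}_{\boldsymbol z,t-1}^{-1} \boldsymbol z_t \boldsymbol z_t^\top \mathbf{G}_{\boldsymbol z,t-1}^{-1}}{1 + \boldsymbol z_t^\top \mathbf{G}_{\boldsymbol z,t-1}^{-1} \boldsymbol z_t}.
$$

The denominator is always at least $1$ when $\lambda > 0$, since $\mathbf{G}_{\boldsymbol z,t-1}$ is then positive definite.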

Figures (11)

  • Figure 1: Relations between IVs (green), Covariates (blue), and Outcome (blue) in Example 1.1 and in general for Online Two-stage Regression. Unobserved variables are in dotted circles. Observed quantities are in solid circles.
  • Figure 2: Regrets due to one-stage OFUL (green), OFUL (blue), and OFUL-IV (orange) for different norms of the hidden parameters ($S=1, 3, 5$ from left to right). Only OFUL-IV (orange) is independent of $S$ and incurs the lowest regret across endogeneity levels $\rho$.
  • Figure 3: (Left) Regression: Identification regrets of Online Ridge (blue) and $\mathsf{O\text{2}SLS}$ (orange) over $T=5000$ steps, for $\rho = 1, 1.5, 2$. As $\rho$, i.e. endogeneity, increases, $\mathsf{O\text{2}SLS}$ performs better. (Right) LBE: Cumulative regrets of $\mathsf{OFUL}$ (blue) and $\mathsf{OFUL\text{-}IV}$ (orange) over $T = 5000$ steps with $\rho = 1, 1.5, 2$. $\mathsf{OFUL\text{-}IV}$ always incurs lower regret, and the improvement w.r.t. $\mathsf{OFUL}$ increases with $\rho$.
  • Figure 5: Cumulative identification regret for an online regression setting of $\mathsf{O\text{2}SLS}$, $\mathsf{Ridge}$ and $\mathsf{VAWR}$ for different endogeneity levels and covariates' dimension.
  • Figure 6: MSE $=\lVert\boldsymbol \beta_t-\boldsymbol \beta\rVert_2$ of $\mathsf{OFUL}$ and $\mathsf{OFUL\text{-}IV}$ for different endogeneity levels and covariates' dimension.
  • ...and 6 more figures

Theorems & Definitions (61)

  • Example 1.1: Learning price-sales dynamics
  • Remark 3.1: Proper online learning
  • Remark 3.2: Hardness of exogeneity vs. endogeneity for prediction
  • Lemma 3.3: Confidence ellipsoid for the second-stage parameters
  • Theorem 3.4: Identification regret of $\mathsf{O\text{2}SLS}$
  • Theorem 3.5: Oracle regret of $\mathsf{O\text{2}SLS}$
  • Remark 4.1: Alternative protocol of LBE
  • Theorem 4.2: Regret upper bound of $\mathsf{OFUL\text{-}IV}$
  • Definition A.1: $\ell_p$-norms
  • Definition A.2: $\ell_2$-operator norm
  • ...and 51 more