Semi-parametric inference based on adaptively collected data

Licong Lin, Koulik Khamaru, Martin J. Wainwright

TL;DR

This work develops AdapTZ, a family of adaptive two-stage $Z$-estimators for semi-parametric models under adaptively collected data. It establishes asymptotic normality for target parameters in both partial linear models and generalized linear models by leveraging Neyman orthogonality and adaptive re-weighting to neutralize nuisance perturbations. The results hold under mild explorability conditions on data collection, including decay bounds on selection probabilities, and extend to fixed-direction inferences with weaker requirements. Numerical experiments on adaptive linear and logistic models illustrate improved confidence interval coverage over traditional methods. The framework also covers sparse high-dimensional and nonparametric nuisance settings, with several corollaries demonstrating practical pilot-estimator choices such as OLS, Lasso, and $k$-NN pilots, and addresses scenarios where selection probabilities are unknown. This provides a principled path for valid inference in sequential, bandit-like data collection contexts with complex nuisance structure.

Abstract

Many standard estimators, when applied to adaptively collected data, fail to be asymptotically normal, thereby complicating the construction of confidence intervals. We address this challenge in a semi-parametric context: estimating the parameter vector of a generalized linear regression model contaminated by a non-parametric nuisance component. We construct suitably weighted estimating equations that account for adaptivity in data collection, and provide conditions under which the associated estimates are asymptotically normal. Our results characterize the degree of "explorability" required for asymptotic normality to hold. For the simpler problem of estimating a linear functional, we provide similar guarantees under much weaker assumptions. We illustrate our general theory with concrete consequences for various problems, including standard linear bandits and sparse generalized bandits, and compare with other methods via simulation studies.
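The opening claim of the abstract can be explored with a small simulation. The following is a minimal sketch and not the paper's experimental setup: it assumes a hypothetical two-armed bandit with equal arm means, unit-variance Gaussian noise, and an epsilon-greedy sampling rule (the function name and all parameter values are illustrative). It returns the standardized error of the plain sample mean of arm 0, which would look approximately standard normal if the data had been collected non-adaptively.

```python
import numpy as np


def simulate_standardized_errors(n=500, trials=200, eps=0.1, seed=0):
    """Standardized error of the arm-0 sample mean under epsilon-greedy sampling.

    Hypothetical setup for illustration only: two arms with equal means
    and unit-variance Gaussian rewards.
    """
    rng = np.random.default_rng(seed)
    true_means = np.array([0.0, 0.0])
    z = np.empty(trials)
    for r in range(trials):
        sums = np.zeros(2)
        counts = np.zeros(2)
        for t in range(n):
            if t < 2:
                arm = t  # pull each arm once to initialize
            elif rng.random() < eps:
                arm = int(rng.integers(2))  # explore uniformly at random
            else:
                arm = int(np.argmax(sums / counts))  # exploit the current leader
            reward = true_means[arm] + rng.standard_normal()
            sums[arm] += reward
            counts[arm] += 1
        mean0 = sums[0] / counts[0]
        # Noise variance is 1, so sqrt(count) * error is the usual z-score.
        z[r] = np.sqrt(counts[0]) * (mean0 - true_means[0])
    return z


if __name__ == "__main__":
    z = simulate_standardized_errors()
    print("mean of z:", z.mean(), "std of z:", z.std())
```

Comparing a histogram of the returned values against a standard normal density is one way to see how much the adaptive sampling rule distorts the distribution in this toy setting.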

Paper Structure

This paper contains 64 sections, 30 theorems, 271 equations, 5 figures, 2 algorithms.

Key Result

Theorem 2.1

Suppose that Assumptions assn-lin-noise, assn-lin-selection-prob and assn-lin-nuisance-est are in force. Then the estimate $\widetilde{\theta}$ obtained from AdapTZ-PL (Algorithm algo:DML-linear) satisfies

Figures (5)

  • Figure 1: (a): Standardized estimation error of the $Z$-estimator \ref{EqnEEIntro} for the first coordinate $\theta^*_1$; shown is a histogram based on $1000$ trials. (b): Empirical coverage probability of two-sided confidence intervals for $\theta^*_1$ in a simulation with parameters $(d_{T}, d_{N}, n) = (2, 1000, 950)$. See Section \ref{sec:exp_linear} for details.
  • Figure 2: Average coverage and width of confidence intervals for $\theta^*_1$ over $T = 1000$ repetitions of an adaptive linear model. The error bars denote $\pm 1$ standard error. Parameters: $d_{T} = 2, d_{N} = 5, n = 500, n_1 = 125$, $C = 2$ and $t = 0.2$. (a) and (b): Coverage of level $1 - \alpha$ one-sided confidence intervals for $\theta^*_1$. (c): Width of level $1 - \alpha$ two-sided confidence intervals for $\theta^*_1$.
  • Figure 3: Average coverage and width of confidence intervals for $\theta^*_1$ over $1000$ repetitions of an adaptive linear model. The error bars are $\pm 1$ standard error. Parameters: $d_{T}=2, d_{N}=1000, n=950, n_1=475$, $C=16$ and $t=0.2$. (a) and (b): coverage of level $1 - \alpha$ one-sided confidence intervals for $\theta^*_1$. (c): width of level $1 - \alpha$ two-sided confidence intervals for $\theta^*_1$.
  • Figure 4: Average coverage and width of confidence intervals for $\theta^*_1$ over $1000$ repetitions of an adaptive logistic model. The error bars denote $\pm 1$ standard error. Parameters: $d_{T}=2, d_{N}=20, n=2000, n_1=1000$, $C=8$ and $t=0.1$. Panels (a) and (b) give coverage of level $1 - \alpha$ one-sided confidence intervals for $\theta^*_1$. Panel (c) shows the width of level $1 - \alpha$ two-sided confidence intervals for $\theta^*_1$.
  • Figure 5: Average coverage and width of confidence intervals for $\theta^*_1$ over $1000$ repetitions of an adaptive logistic model. The error bars denote $\pm 1$ standard error. Parameters: $d_{T}=2, d_{N}=1000, n=950, n_1=475$, $C=100$ and $t=0.1$. (a) and (b): coverage of level $1 - \alpha$ one-sided confidence intervals for $\theta^*_1$. (c): Width of level $1 - \alpha$ two-sided confidence intervals for $\theta^*_1$.
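
The quantities reported in Figures 2 to 5 (average coverage, average width, and $\pm 1$ standard error bars) can be computed from repeated trials along the following lines. This is a minimal sketch under stated assumptions: the helper name `coverage_summary` is hypothetical, the intervals are two-sided, and the error bar is taken to be the binomial standard error of the empirical coverage, which the captions do not spell out.

```python
import numpy as np


def coverage_summary(lower, upper, truth):
    """Empirical coverage, its standard error, and the mean interval width.

    lower, upper: interval endpoints, one entry per repetition.
    truth: the true parameter value (e.g. the first coordinate of theta*).
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    covered = (lower <= truth) & (truth <= upper)
    coverage = covered.mean()
    # Assumed error-bar definition: binomial standard error of the coverage.
    se = np.sqrt(coverage * (1.0 - coverage) / covered.size)
    width = (upper - lower).mean()
    return coverage, se, width
```

For example, with four intervals `[-1, 1]`, `[-1, 1]`, `[2, 3]`, `[-1, 1]` and a true value of `0`, three of the four intervals cover the truth, so the empirical coverage is 0.75 and the mean width is 1.75.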

Theorems & Definitions (48)

  • Theorem 2.1
  • Theorem 2.2
  • Theorem 2.3
  • Theorem 2.4
  • Corollary 1
  • Corollary 2
  • Corollary 3
  • Corollary 4
  • Lemma 1
  • Corollary 5
  • ...and 38 more