Nonparametric Instrumental Regression via Kernel Methods is Minimax Optimal

Dimitri Meunier, Zhu Li, Tim Christensen, Arthur Gretton

TL;DR

This work analyzes kernel NPIV for nonparametric instrumental variable estimation, extending beyond prior identifiability assumptions by targeting the minimum-norm solution and proving convergence in the strong $L_2$-norm. It introduces a subspace-size measure via a link condition that ties the projected CME subspace to the kernel regression space, showing how instrument strength affects learning efficiency. By adopting general spectral regularization in Stage 1, the authors overcome the saturation of standard Tikhonov regularization and derive minimax-optimal rates under mild conditions, including misspecification. The results provide a principled framework for kernel NPIV with stable, rate-optimal performance in both identified and unidentified settings, with clear implications for the role of instrument strength and potential data-adaptive Stage 1 methods in practice.

Abstract

We study the kernel instrumental variable algorithm of Singh et al. (2019), a nonparametric two-stage least squares (2SLS) procedure which has demonstrated strong empirical performance. We provide a convergence analysis that covers both the identified and unidentified settings: when the structural function cannot be identified, we show that the kernel NPIV estimator converges to the IV solution with minimum norm. Crucially, our convergence is with respect to the strong $L_2$-norm, rather than a pseudo-norm. Additionally, we characterize the smoothness of the target function without relying on the instrument, instead leveraging a new description of the projected subspace size (this being closely related to the link condition in the inverse learning literature). With the subspace size description and under standard kernel learning assumptions, we derive, for the first time, the minimax optimal learning rate for kernel NPIV in the strong $L_2$-norm. Our result demonstrates that the strength of the instrument is essential to achieve efficient learning. We also improve the original kernel NPIV algorithm by adopting a general spectral regularization in the stage 1 regression. The modified regularization can overcome the saturation effect of Tikhonov regularization.
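To make the two-stage structure described in the abstract concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the shared bandwidth, and all function names (`rbf_kernel`, `stage1_weights`, `fit_kernel_npiv`, and the two filters) are hypothetical choices made for this example. Stage 1 estimates the conditional mean embedding from samples `(x1, z1)` by applying a spectral filter to the eigenvalues of the instrument Gram matrix; stage 2 runs a ridge regression of `y2` on the embedded instruments `z2`.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a (n, d) and b (m, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def tikhonov_filter(s, lam):
    """Tikhonov (ridge) filter g(s) = 1 / (s + lam)."""
    return 1.0 / (s + lam)


def cutoff_filter(s, lam):
    """Spectral cut-off filter g(s) = 1/s for s >= lam, and 0 otherwise."""
    out = np.zeros_like(s)
    keep = s >= lam
    out[keep] = 1.0 / s[keep]
    return out


def stage1_weights(k_zz, k_z_z2, lam, stage1_filter):
    """Stage 1: weights gamma (n, m) such that the estimated embedding of
    z2[j] is sum_i gamma[i, j] * phi(x1[i]).

    Applies the filter to the eigenvalues of K_ZZ / n. With the Tikhonov
    filter this equals (K_ZZ + n * lam * I)^{-1} K_{Z, Z2}.
    """
    n = k_zz.shape[0]
    s, u = np.linalg.eigh(k_zz / n)
    s = np.maximum(s, 0.0)  # clip tiny negative eigenvalues from round-off
    g = stage1_filter(s, lam)
    return (u * g) @ (u.T @ k_z_z2) / n


def fit_kernel_npiv(x1, z1, y2, z2, lam, xi,
                    stage1_filter=tikhonov_filter, bandwidth=1.0):
    """Two-stage kernel IV sketch; returns a function x_new -> h_hat(x_new)."""
    m = z2.shape[0]
    k_xx = rbf_kernel(x1, x1, bandwidth)
    k_zz = rbf_kernel(z1, z1, bandwidth)
    k_z_z2 = rbf_kernel(z1, z2, bandwidth)

    gamma = stage1_weights(k_zz, k_z_z2, lam, stage1_filter)
    w = k_xx @ gamma  # w[:, j] holds <phi(x1[i]), mu_hat(z2[j])> for each i

    # Stage 2: minimize (1/m) * ||y2 - w.T @ alpha||^2 + xi * alpha.T @ K_XX @ alpha.
    # The normal equations are (w w^T + m * xi * K_XX) alpha = w y2; lstsq
    # returns a minimum-norm solution when the system is ill-conditioned.
    alpha = np.linalg.lstsq(w @ w.T + m * xi * k_xx, w @ y2, rcond=None)[0]

    def predict(x_new):
        return rbf_kernel(x_new, x1, bandwidth) @ alpha

    return predict
```

Passing `stage1_filter=cutoff_filter` swaps the stage 1 regularizer without touching stage 2, which mirrors the modification the abstract describes. The regularization parameters `lam` and `xi` are left to the caller; no tuning rule is implied here.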

Paper Structure

This paper contains 44 sections, 45 theorems, and 200 equations.

Key Result

Theorem 1

(li2024towards) For every function $F\in \mathcal{G}$ there exists a unique operator $C \in S_2(\mathcal{H}_Z, \mathcal{H}_X)$ such that $F(\cdot) = C\phi_Z(\cdot) \in \mathcal{H}_X$ with $\|C\|_{S_2(\mathcal{H}_Z, \mathcal{H}_X)} = \|F\|_{\mathcal{G}}$ and vice versa. Hence $\mathcal{G} \simeq S_2(\m

Theorems & Definitions (91)

  • Remark 1: aubin2000applied, Theorem 12.6.1
  • Remark 2: General multiplicative kernel
  • Theorem 1: vRKHS isomorphism
  • Definition 1
  • Definition 2: Filter function
  • Proposition 1
  • Remark 3
  • Remark 4: Spectral Algorithm
  • Remark 5: Subspace size
  • Remark 6: Link condition in inverse problem
  • ...and 81 more
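
The list above names a filter function (Definition 2) and a spectral algorithm (Remark 4), but this excerpt does not reproduce their statements. As a hedged reminder of the standard forms from the spectral regularization literature, which may differ in constants or normalization from the paper's own definition, two common filters $g_\lambda$ and their residuals $r_\lambda(\sigma) = 1 - \sigma g_\lambda(\sigma)$ are:

```latex
% Tikhonov (ridge) filter
g_\lambda^{\mathrm{Tik}}(\sigma) = \frac{1}{\sigma + \lambda},
\qquad
r_\lambda^{\mathrm{Tik}}(\sigma) = \frac{\lambda}{\sigma + \lambda}.

% Spectral cut-off filter
g_\lambda^{\mathrm{cut}}(\sigma) = \sigma^{-1}\,\mathbf{1}\{\sigma \ge \lambda\},
\qquad
r_\lambda^{\mathrm{cut}}(\sigma) = \mathbf{1}\{\sigma < \lambda\}.

% Qualification: the largest \nu for which the residual is controlled as
\sup_{\sigma > 0} \sigma^{\nu}\,\lvert r_\lambda(\sigma) \rvert \le c_\nu\,\lambda^{\nu}.
```

For the Tikhonov filter this bound holds only for $\nu \le 1$ (by the weighted AM-GM inequality, $\sigma^{\nu}\lambda^{1-\nu} \le \sigma + \lambda$), which is the source of the saturation effect mentioned in the abstract. For the cut-off filter it holds for every $\nu \ge 0$ with $c_\nu = 1$, since $\sigma^{\nu}\mathbf{1}\{\sigma < \lambda\} \le \lambda^{\nu}$.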