Statistical Optimality of Divide and Conquer Kernel-based Functional Linear Regression

Jiading Liu, Lei Shi

TL;DR

This work analyzes divide-and-conquer kernel-based regularized functional linear regression in a model misspecification setting where the slope $\beta_0$ need not lie in the RKHS ${\cal H}_K$. Using an integral-operator framework with operators $L_K$ and $L_C$ and composite $T=L_K^{1/2}L_C L_K^{1/2}$, it derives sharp finite-sample upper bounds and minimax lower bounds for the excess prediction risk, revealing rate-optimal convergence under general regularity conditions $L_C^{1/2}\beta_0=T_*^{\theta}(\gamma_0)$ with $0<\theta\le 1/2$ and polynomial eigen-decay $\mu_k\asymp k^{-1/p}$. The paper also shows that in the noiseless case ($\sigma=0$), the estimators can achieve arbitrarily fast polynomial rates, highlighting adaptivity to problem complexity. The results extend existing minimax rates beyond the standard assumption $\beta_0\in {\cal H}_K$, quantify the trade-offs between partition count $m$, regularization $\lambda$, and effective dimension $\mathcal{N}(\lambda)$, and connect to broader literature on kernel ridge regression, stochastic approximation, and inverse problems. Overall, the work provides a rigorous, scalable framework for kernel-based functional regression with model misspecification, offering practical guidance for large-scale applications where $\beta_0$ may lie outside the chosen RKHS prior.
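
To make the role of the effective dimension concrete, here is a short worked bound. It is a sketch under assumptions not stated in this summary: that $\mathcal{N}(\lambda)$ is the usual trace quantity, that $\mu_k$ are the eigenvalues of $T$, that $\mu_k\le C k^{-1/p}$ for a constant $C$, and that $0<p<1$ and $0<\lambda\le 1$.

$$\mathcal{N}(\lambda)=\mathrm{Tr}\big((T+\lambda I)^{-1}T\big)=\sum_{k\ge 1}\frac{\mu_k}{\mu_k+\lambda}.$$

Splitting the sum at $k_\lambda=\lceil \lambda^{-p}\rceil$, bounding each of the first $k_\lambda$ terms by $1$ and each remaining term by $\mu_k/\lambda$, gives

$$\mathcal{N}(\lambda)\le k_\lambda+\frac{C}{\lambda}\int_{k_\lambda}^{\infty}x^{-1/p}\,dx
= k_\lambda+\frac{C}{\lambda}\cdot\frac{p}{1-p}\,k_\lambda^{1-1/p}
\le \Big(2+\frac{Cp}{1-p}\Big)\lambda^{-p}.$$

Under these assumptions, a smaller $\lambda$ reduces regularization bias but inflates $\mathcal{N}(\lambda)$ at the rate $\lambda^{-p}$, which is the trade-off the bounds quantify.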

Abstract

Previous analyses of regularized functional linear regression in a reproducing kernel Hilbert space (RKHS) typically require the target function to lie in this kernel space. This paper studies the convergence of divide-and-conquer estimators when the target function does not necessarily reside in the underlying RKHS. As a decomposition-based scalable approach, divide-and-conquer estimators for functional linear regression can substantially reduce the time and memory costs of computation. We develop an integral operator approach to establish sharp finite-sample upper bounds for prediction with divide-and-conquer estimators under various regularity conditions on the explanatory variables and the target function. We also prove the asymptotic optimality of the derived rates by establishing minimax lower bounds. Finally, we consider the convergence of noiseless estimators and show that the rates can be arbitrarily fast under mild conditions.
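
The divide-and-conquer scheme described above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `local_estimator` and `dc_estimator`, the kernel $K(s,t)=\min(s,t)$, the cosine-based covariates, the slope $\beta_0(t)=1+t$, and the values of $N$, $m$, $\lambda$ and $\sigma$ are all hypothetical choices. Curves are discretized on a uniform grid, integrals are replaced by Riemann sums, and each local estimator is computed through the representer form $\hat{\beta}=\sum_i c_i L_K X_i$.

```python
import numpy as np

def local_estimator(X, Y, K, lam):
    """Regularized least squares on one subset; returns beta_hat on the grid."""
    n, T = X.shape
    w = 1.0 / T                       # quadrature weight on a uniform grid of [0, 1]
    LX = w * (X @ K)                  # row i approximates (L_K X_i)(t); K is symmetric
    G = w * (LX @ X.T)                # G[i, j] approximates <L_K X_i, X_j>
    c = np.linalg.solve(G + n * lam * np.eye(n), Y)
    return LX.T @ c                   # beta_hat = sum_i c_i L_K X_i

def dc_estimator(X, Y, K, lam, m, seed=0):
    """Average the m local estimators computed on a random partition of the sample."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(Y)), m)
    return np.mean([local_estimator(X[p], Y[p], K, lam) for p in parts], axis=0)

# Synthetic data (all settings below are illustrative assumptions).
rng = np.random.default_rng(1)
T, N, m, lam, sigma = 50, 400, 4, 1e-3, 0.05
t = (np.arange(T) + 0.5) / T                      # midpoint grid on [0, 1]
K = np.minimum.outer(t, t)                        # kernel K(s, t) = min(s, t)
basis = np.sqrt(2) * np.cos(np.pi * np.outer(np.arange(1, 6), t))

def sample_X(n):
    # Random curves: sum_k xi_k / k * sqrt(2) cos(k pi t), with xi_k ~ N(0, 1).
    return (rng.standard_normal((n, 5)) / np.arange(1, 6)) @ basis

# beta0(0) != 0, so beta0 is outside the RKHS of min(s, t), whose elements vanish at 0.
beta0 = 1.0 + t
X = sample_X(N)
Y = X @ beta0 / T + sigma * rng.standard_normal(N)

beta_dc = dc_estimator(X, Y, K, lam, m)

# Excess prediction risk on fresh curves, relative to the signal variance.
X_test = sample_X(2000)
excess = np.mean((X_test @ (beta_dc - beta0) / T) ** 2)
signal = np.mean((X_test @ beta0 / T) ** 2)
print(f"relative excess prediction risk: {excess / signal:.4f}")
```

With a dense solver, each local fit costs on the order of $(N/m)^3$ operations instead of $N^3$ for the full sample, which is where the time and memory savings come from. The quality of the averaged estimator is judged here by prediction error rather than by distance to $\beta_0$ in the RKHS norm, in line with the prediction-risk focus of the paper.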

Paper Structure (13 sections, 25 theorems, 212 equations)

Key Result

Proposition 1

The estimator $\hat{\beta}_{S,\lambda}$ in (totalestimator) can be expressed as $\hat{\beta}_{S,\lambda}=L^{1/2}_K \hat{f}_{S,\lambda}$ with where $I$ denotes the identity operator on ${\cal L}^2({\cal T})$, $|S|=N$ is the cardinality of $S=\{(X_i,Y_i)\}_{i=1}^N$, and $T_{\bf X}: {\cal L}^2({\cal T}) \to {\cal L}^2({\cal T})$ is an empirical operator with ${\bf X}=\{X_1,\ldots,X_N\}$ defined by
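
For orientation, the following sketch shows how a representation of this type arises. It assumes that (totalestimator) is the standard Tikhonov-regularized least-squares problem over ${\cal H}_K$; the displayed formulas are a derivation under that assumption and are not quoted from the paper.

$$\hat{\beta}_{S,\lambda}=\arg\min_{\beta\in{\cal H}_K}\ \frac{1}{N}\sum_{i=1}^N\big(Y_i-\langle \beta,X_i\rangle\big)^2+\lambda\|\beta\|_K^2 .$$

Writing $\beta=L_K^{1/2}f$ with $f\in{\cal L}^2({\cal T})$ orthogonal to the null space of $L_K$, one has $\|\beta\|_K=\|f\|_{{\cal L}^2}$ and $\langle\beta,X_i\rangle=\langle f,L_K^{1/2}X_i\rangle$, so the problem becomes ridge regression in ${\cal L}^2({\cal T})$ with features $L_K^{1/2}X_i$. Setting the derivative in $f$ to zero gives the normal equation

$$\big(\lambda I+T_{\bf X}\big)\hat{f}_{S,\lambda}=\frac{1}{N}\sum_{i=1}^N Y_i\,L_K^{1/2}X_i,
\qquad
T_{\bf X}=\frac{1}{N}\sum_{i=1}^N\big\langle L_K^{1/2}X_i,\ \cdot\ \big\rangle\,L_K^{1/2}X_i ,$$

and hence, since $\lambda I+T_{\bf X}$ is invertible for $\lambda>0$,

$$\hat{f}_{S,\lambda}=\big(\lambda I+T_{\bf X}\big)^{-1}\frac{1}{N}\sum_{i=1}^N Y_i\,L_K^{1/2}X_i .$$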

Theorems & Definitions (34)

  • Proposition 1
  • Theorem 1: mini-max convergence lower bound
  • Theorem 2: convergence upper bound I
  • Theorem 3: convergence upper bound II
  • Corollary 1
  • Theorem 4: convergence upper bound III
  • Corollary 2
  • Theorem 5: convergence upper bound IV
  • Theorem 6: convergence upper bound V
  • Lemma 1
  • ...and 24 more