Statistical Optimality of Divide and Conquer Kernel-based Functional Linear Regression
Jiading Liu, Lei Shi
TL;DR
This work analyzes divide-and-conquer kernel-based regularized functional linear regression under model misspecification, where the slope function $\beta_0$ need not lie in the RKHS ${\cal H}_K$. Using an integral-operator framework built on the operators $L_K$ and $L_C$ and the composite operator $T=L_K^{1/2}L_C L_K^{1/2}$, it derives sharp finite-sample upper bounds and minimax lower bounds for the excess prediction risk. Together these bounds establish rate-optimal convergence under the regularity condition $L_C^{1/2}\beta_0=T_*^{\theta}(\gamma_0)$ with $0<\theta\le 1/2$ and polynomial eigenvalue decay $\mu_k\asymp k^{-1/p}$. The paper also shows that in the noiseless case ($\sigma=0$) the estimators can achieve arbitrarily fast polynomial rates. The results extend existing minimax rates beyond the standard assumption $\beta_0\in {\cal H}_K$ and quantify the trade-offs among the partition count $m$, the regularization parameter $\lambda$, and the effective dimension $\mathcal{N}(\lambda)$. They also connect to the literature on kernel ridge regression, stochastic approximation, and inverse problems. Overall, the work gives a rigorous analysis of a scalable estimator for kernel-based functional regression, with guidance for large-scale applications in which $\beta_0$ may lie outside the chosen RKHS.
Abstract
Previous analysis of regularized functional linear regression in a reproducing kernel Hilbert space (RKHS) typically requires the target function to be contained in this kernel space. This paper studies the convergence performance of divide-and-conquer estimators in the scenario where the target function does not necessarily reside in the underlying RKHS. As a decomposition-based scalable approach, the divide-and-conquer estimators of functional linear regression can substantially reduce the algorithmic complexity in time and memory. We develop an integral operator approach to establish sharp finite-sample upper bounds for prediction with divide-and-conquer estimators under various regularity conditions on the explanatory variables and the target function. We also prove the asymptotic optimality of the derived rates by establishing minimax lower bounds. Finally, we consider the convergence of noiseless estimators and show that the rates can be arbitrarily fast under mild conditions.
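
To make the divide-and-conquer idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes functions observed on a uniform grid over $[0,1]$, integrals approximated by Riemann sums, the kernel $K(s,t)=\min(s,t)$ as an illustrative choice, and equal-weight averaging of the local estimators. The function names (`brownian_kernel`, `fit_local`, `fit_divide_and_conquer`) and all parameter values are hypothetical. Each local estimator uses the representer form $\hat\beta=\sum_i c_i L_K X_i$ with $c=(G+n\lambda I)^{-1}Y$, where $G_{ij}=\iint X_i(s)K(s,t)X_j(t)\,ds\,dt$.

```python
import numpy as np

def brownian_kernel(grid):
    # Illustrative reproducing kernel K(s, t) = min(s, t); an assumption, not the paper's choice.
    return np.minimum.outer(grid, grid)

def fit_local(X, y, K, w, lam):
    """Regularized kernel estimator of the slope function on one data subset.

    X: (n, T) curves sampled on the grid, y: (n,) responses,
    K: (T, T) kernel matrix, w: quadrature weight (grid spacing), lam: regularization.
    Returns the estimated slope function evaluated on the grid, shape (T,).
    """
    n = X.shape[0]
    # Gram matrix G_ij ~ double integral of X_i(s) K(s, t) X_j(t).
    G = (w ** 2) * X @ K @ X.T
    # Minimizer of (1/n) * ||y - G c||^2 + lam * c^T G c.
    c = np.linalg.solve(G + n * lam * np.eye(n), y)
    # beta_hat(t) = sum_i c_i * integral of K(t, s) X_i(s) ds.
    return w * K @ X.T @ c

def fit_divide_and_conquer(X, y, K, w, lam, m, rng):
    """Split the sample into m disjoint subsets, fit each locally, average the results."""
    parts = np.array_split(rng.permutation(X.shape[0]), m)
    betas = [fit_local(X[p], y[p], K, w, lam) for p in parts]
    return np.mean(betas, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T = 50
    grid = (np.arange(T) + 0.5) / T          # midpoints of a uniform grid on [0, 1]
    w = 1.0 / T
    K = brownian_kernel(grid)
    # Synthetic Brownian-motion-like curves and a noiseless response (sigma = 0).
    X = np.cumsum(rng.standard_normal((400, T)), axis=1) * np.sqrt(w)
    beta0 = grid.copy()                       # hypothetical true slope beta_0(t) = t
    y = w * X @ beta0
    beta_hat = fit_divide_and_conquer(X[:300], y[:300], K, w, lam=1e-3, m=4, rng=rng)
    mse = np.mean((w * X[300:] @ beta_hat - y[300:]) ** 2)
    print("held-out prediction MSE:", mse)
```

Each local fit solves an $n_j\times n_j$ linear system, so splitting $N$ observations into $m$ subsets replaces one system of size $N$ with $m$ systems of size roughly $N/m$, which is the source of the time and memory savings mentioned above. Every subset uses the same $\lambda$ in this sketch; how $\lambda$ and $m$ should scale with the sample size is the subject of the paper's bounds and is not reproduced here.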
