Adaptive functional principal components analysis
Sunny G. W. Wang, Valentin Patilea, Nicolas Klutchnikoff
TL;DR
This work introduces an adaptive functional principal components analysis framework that exploits replication across curves to estimate local path regularity and selects a smoothing bandwidth for each eigen-element via sharp risk bounds. By formulating a "first smooth, then estimate" pipeline with a diagonal bias-corrected covariance estimator, the method derives explicit, data-driven bandwidth rules that adapt to the local Hölder structure $H_t$ and $L_t$ and to the design regime. Theoretical results establish risk bounds and convergence rates for eigenvalues and eigenfunctions, with feasible plug-in bounds shown to preserve these rates. Numerical experiments, including a general-purpose simulator and a real electricity consumption dataset, demonstrate substantial gains in accuracy and computational efficiency over existing FPCA approaches, with the accompanying FDAdapt package enabling practical deployment.
Abstract
Functional data analysis almost always involves smoothing discrete observations into curves, because the curves are never observed in continuous time and rarely without error. Although smoothing parameters affect the subsequent inference, data-driven methods for selecting these parameters are not well developed, frustrated by the difficulty of using all the information shared by curves while remaining computationally efficient. On the one hand, smoothing individual curves in an isolated, albeit sophisticated, way ignores useful signals present in other curves. On the other hand, bandwidth selection by automatic procedures such as cross-validation after pooling all the curves together quickly becomes computationally infeasible due to the large number of data points. In this paper we propose a new data-driven, adaptive kernel smoothing method, specifically tailored for functional principal components analysis, through the derivation of sharp, explicit risk bounds for the eigen-elements. The minimization of these quadratic risk bounds provides refined, yet computationally efficient, bandwidth rules for each eigen-element separately. Both common and independent design cases are allowed. Rates of convergence for the estimators are derived. An extensive simulation study, designed in a versatile manner to closely mimic the characteristics of real data sets, supports our methodological contribution. An illustration on a real data application is provided.
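To make the "first smooth, then estimate" idea concrete, the following is a minimal sketch rather than the authors' procedure or the FDAdapt API. It assumes a single fixed bandwidth `h`, a Gaussian kernel, and simulated curves in an independent design. The adaptive, per-eigen-element bandwidth rules and the diagonal bias correction described above are deliberately left out. All function names (`nw_smooth`, `smooth_then_fpca`, `simulate`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def nw_smooth(t_obs, y_obs, grid, h):
    """Nadaraya-Watson smoother with a Gaussian kernel, evaluated on `grid`."""
    u = (grid[:, None] - t_obs[None, :]) / h
    w = np.exp(-0.5 * u**2)  # kernel weights, shape (len(grid), len(t_obs))
    return (w @ y_obs) / w.sum(axis=1)

def smooth_then_fpca(times, values, grid, h, n_comp=2):
    """Smooth each curve, then eigendecompose the empirical covariance on `grid`.

    Assumption: one fixed bandwidth `h` for all curves and all eigen-elements,
    and no diagonal bias correction (both differ from the paper's method).
    """
    smoothed = np.array([nw_smooth(t, y, grid, h) for t, y in zip(times, values)])
    centered = smoothed - smoothed.mean(axis=0)
    cov = centered.T @ centered / len(smoothed)
    dt = grid[1] - grid[0]  # quadrature weight for the covariance operator
    evals, evecs = np.linalg.eigh(cov * dt)
    order = np.argsort(evals)[::-1][:n_comp]
    # Rescale eigenvectors so the eigenfunctions have unit L2 norm on the grid
    return evals[order], evecs[:, order].T / np.sqrt(dt)

def simulate(n_curves=200, n_points=50, noise_sd=0.1, seed=0):
    """Toy data: two-component process observed with noise at random times."""
    rng = np.random.default_rng(seed)
    times, values = [], []
    for _ in range(n_curves):
        t = np.sort(rng.uniform(0.0, 1.0, n_points))
        scores = rng.normal(0.0, [1.0, 0.5])  # score std devs, so eigenvalues are 1 and 0.25
        x = (scores[0] * np.sqrt(2) * np.sin(2 * np.pi * t)
             + scores[1] * np.sqrt(2) * np.cos(2 * np.pi * t))
        times.append(t)
        values.append(x + rng.normal(0.0, noise_sd, n_points))
    return times, values

grid = np.linspace(0.0, 1.0, 101)
times, values = simulate()
evals, efuns = smooth_then_fpca(times, values, grid, h=0.05)
print("estimated leading eigenvalues:", evals)
```

In this sketch the choice `h=0.05` is arbitrary. The point of the paper is precisely that such a bandwidth should instead be chosen from the data, separately for each eigenvalue and eigenfunction, by minimizing an explicit risk bound.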
