From sparse to dense functional data in high dimensions: Revisiting phase transitions from a non-asymptotic perspective
Shaojun Guo, Dong Li, Xinghao Qiao, Yizhu Wang
TL;DR
This work studies nonparametric mean and covariance estimation for high-dimensional, partially observed functional data using a unified local linear smoothing framework. It establishes non-asymptotic, generalized sub-Gaussian concentration bounds in both $L_2$ and supremum norms, derives elementwise maximum convergence rates, and reveals scaled phase transitions as the average sampling frequency per subject grows relative to $n$ and $\log p$. The results underpin downstream procedures such as FPCA, sparse FPCA, and functional thresholding by providing sharp rates for covariance estimation and eigenstructure recovery in high dimensions. Simulations corroborate the theory, showing phase-transition-like behavior across the sparse, semi-dense, and ultra-dense regimes and illustrating the impact of $\log p$ on error rates. The framework extends previous asymptotic phase transition analyses to a non-asymptotic, high-dimensional setting, enabling rigorous guarantees for functional data analysis in applications with many functional variables.
Abstract
Nonparametric estimation of the mean and covariance functions is ubiquitous in functional data analysis, and local linear smoothing is the most frequently used technique. Zhang and Wang (2016) explored different types of asymptotic properties of these estimators, revealing interesting phase transition phenomena based on the order of the average sampling frequency per subject $T$ relative to the number of subjects $n$, which partitions the data into three categories: "sparse", "semi-dense", and "ultra-dense". In the increasingly common high-dimensional scenario, where the number of functional variables $p$ is large relative to $n$, we revisit this open problem from a non-asymptotic perspective by deriving comprehensive concentration inequalities for the local linear smoothers. Besides being of interest in themselves, our non-asymptotic results lead to elementwise maximum rates of $L_2$ convergence and uniform convergence, which serve as a fundamentally important tool for further convergence analysis when $p$ grows exponentially with $n$ and possibly $T$. With extra $\log p$ terms present to account for the high-dimensional effect, we then investigate the scaled phase transitions and the corresponding elementwise maximum rates from sparse to semi-dense to ultra-dense functional data in high dimensions. We also discuss a couple of applications of our theoretical results. Finally, numerical studies are carried out to confirm the established theoretical properties.
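
To make the central object concrete, here is a minimal sketch of a local linear smoother for the mean function of a single functional variable, fitted to observations pooled across subjects. It is an illustration under stated assumptions, not the paper's implementation: the function name `local_linear_mean`, the Epanechnikov kernel, the equal weighting of all observations, the fixed bandwidth `h`, and the simulated data are all choices made here for the example.

```python
import numpy as np

def local_linear_mean(t_obs, y_obs, t_grid, h):
    """Local linear estimate of a mean function on t_grid.

    t_obs, y_obs: observation times and noisy values pooled over all
    subjects (hypothetical inputs; every observation gets equal weight).
    h: bandwidth, assumed fixed rather than chosen by the data.
    """
    t_obs = np.asarray(t_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    est = np.empty(len(t_grid))
    for k, t in enumerate(t_grid):
        d = t_obs - t                      # centered design points
        u = d / h
        # Epanechnikov kernel weights (an assumed kernel choice)
        w = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
        X = np.column_stack([np.ones_like(d), d])
        WX = X * w[:, None]
        # Solve the 2x2 weighted least squares normal equations;
        # lstsq still returns a solution if the system is singular.
        beta = np.linalg.lstsq(X.T @ WX, WX.T @ y_obs, rcond=None)[0]
        est[k] = beta[0]                   # intercept = estimate at t
    return est

# Simulated sparse design: 200 subjects, 3 random time points each.
rng = np.random.default_rng(0)
n, T = 200, 3
times = rng.uniform(0.0, 1.0, size=(n, T))
values = np.sin(2 * np.pi * times) + 0.3 * rng.standard_normal((n, T))

grid = np.linspace(0.1, 0.9, 9)
mu_hat = local_linear_mean(times.ravel(), values.ravel(), grid, h=0.15)
print(np.round(mu_hat, 2))
```

In a high-dimensional setting this smoother would be applied to each of the $p$ functional variables, and the elementwise maximum rates concern the largest estimation error over those variables.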
