Functional Singular Value Decomposition
Jianbin Tan, Pixu Shi, Anru R. Zhang
TL;DR
FSVD introduces a unified low-rank framework for heterogeneous functional data. It defines the decomposition $[X_1, \ldots, X_n]^\top = \sum_r \rho_r \bm{a}_r \phi_r$ with orthonormal components and establishes its existence and basic properties. It extends the FPCA and factor-model viewpoints to irregularly observed and non-identically distributed data through an RKHS-based rank-one kernel ridge regression fitted by an alternating minimization algorithm, with rigorous convergence guarantees for the first component. By introducing intrinsic basis functions and intrinsic basis vectors, FSVD captures both functional and tabular heterogeneity, enabling functional completion, clustering, and regression without estimating the covariance of the data. In simulations and on real data (COVID-19 trajectories and ICU EHRs), FSVD shows superior performance in pattern discovery, missing-data completion, and predictive tasks, offering a flexible toolkit for a wide range of functional-data analyses.
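To make the decomposition above concrete, the following is a minimal sketch of its discretized analog, assuming every function is fully observed without noise on a common uniform grid. That setting is far simpler than the irregular, noisy one the paper targets, and the sketch is not the authors' implementation. The function name `fsvd_dense_grid` and the Riemann-sum scaling are illustrative assumptions.

```python
import numpy as np

def fsvd_dense_grid(X, t, rank):
    """Discretized analog of X_i(t) = sum_r rho_r * a_ir * phi_r(t).

    Assumption (not from the paper): X is an (n, m) array whose rows are
    functions sampled on a shared uniform grid t of length m.
    """
    dt = t[1] - t[0]
    # Scale by sqrt(dt) so Euclidean inner products of rows approximate
    # L2 inner products of functions (Riemann sum).
    U, s, Vt = np.linalg.svd(X * np.sqrt(dt), full_matrices=False)
    rho = s[:rank]                    # singular values rho_r
    A = U[:, :rank]                   # columns a_r, orthonormal in R^n
    Phi = Vt[:rank] / np.sqrt(dt)     # rows phi_r, orthonormal under the Riemann sum
    return rho, A, Phi

# Hypothetical toy data: 30 curves built from two smooth components.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
scores = rng.normal(size=(30, 2))
X = (np.outer(scores[:, 0], np.sin(2 * np.pi * t))
     + 0.5 * np.outer(scores[:, 1], np.cos(2 * np.pi * t)))

rho, A, Phi = fsvd_dense_grid(X, t, rank=2)
X_hat = (A * rho) @ Phi               # sum_r rho_r * a_r * phi_r on the grid
dt = t[1] - t[0]
```

In this idealized case the rank-2 reconstruction `X_hat` matches `X` up to floating-point error, and both `A.T @ A` and `Phi @ Phi.T * dt` are identity matrices, mirroring the orthonormality of the components.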
Abstract
Heterogeneous functional data commonly arise in time series and longitudinal studies. To uncover the statistical structures of such data, we propose Functional Singular Value Decomposition (FSVD), a unified framework encompassing various tasks for the analysis of functional data with potential heterogeneity. We establish the mathematical foundation of FSVD by proving its existence and providing its fundamental properties. We then develop an implementation approach for noisy and irregularly observed functional data based on a novel alternating minimization scheme and provide theoretical guarantees for its convergence and estimation accuracy. The FSVD framework also introduces the concepts of intrinsic basis functions and intrinsic basis vectors, representing two fundamental structural aspects of random functions. These concepts enable FSVD to provide new and improved solutions to tasks including functional principal component analysis, factor models, functional clustering, functional linear regression, and functional completion, while effectively handling heterogeneity and irregular temporal sampling. Through extensive simulations, we demonstrate that FSVD-based methods consistently outperform existing methods across these tasks. To showcase the value of FSVD in real-world datasets, we apply it to extract temporal patterns from a COVID-19 case count dataset and perform data completion on an electronic health record dataset.
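The alternating minimization idea for irregularly observed data can be illustrated with a rough rank-one sketch. It alternates between a kernel ridge update of the function (with the subject weights held fixed) and a per-subject least-squares update of the weights (with the function held fixed). This is an illustration under stated assumptions, not the authors' algorithm: the Gaussian kernel, the bandwidth, the penalty `nu`, the pooled-observation layout, and all function names are hypothetical choices, and the paper's exact objective and normalization may differ.

```python
import numpy as np

def gaussian_kernel(s, t, bandwidth=0.2):
    # Assumed kernel choice; any positive-definite kernel could be substituted.
    d = s[:, None] - t[None, :]
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def fit_phi(y, w, K, nu):
    # Kernel ridge step: minimize sum_k (y_k - w_k * (K c)_k)^2 + nu * c' K c.
    # A stationary point solves (W^2 K + nu I) c = W y, with W = diag(w).
    return np.linalg.solve((w ** 2)[:, None] * K + nu * np.eye(len(y)), w * y)

def rank_one_alternating(y, idx, times, n, nu=1e-2, n_iter=20, bandwidth=0.2):
    """y[k] is an observation of subject idx[k] at time times[k] (pooled layout)."""
    K = gaussian_kernel(times, times, bandwidth)
    a = np.full(n, 1.0 / np.sqrt(n))           # simple unit-norm initialization
    for _ in range(n_iter):
        c = fit_phi(y, a[idx], K, nu)          # update the function given a
        f = K @ c                              # phi evaluated at observed times
        num = np.bincount(idx, weights=y * f, minlength=n)
        den = np.bincount(idx, weights=f * f, minlength=n)
        a = num / np.maximum(den, 1e-12)       # per-subject least squares
        a /= np.linalg.norm(a)                 # keep a on the unit sphere
    c = fit_phi(y, a[idx], K, nu)              # final function fit for the returned a
    return a, c

# Hypothetical toy data: 40 subjects, 8 irregular time points each.
rng = np.random.default_rng(0)
n, J = 40, 8
idx = np.repeat(np.arange(n), J)
times = rng.uniform(0.0, 1.0, size=n * J)
a_true = rng.uniform(0.5, 1.5, size=n)
y = a_true[idx] * np.sin(2 * np.pi * times) + 0.05 * rng.normal(size=n * J)

a_hat, c_hat = rank_one_alternating(y, idx, times, n)
cosine = abs(a_hat @ a_true) / np.linalg.norm(a_true)
```

Here `cosine` measures how well the estimated weight vector aligns with the one used to simulate the data. The fitted function can be evaluated at new times `s` as `gaussian_kernel(s, times) @ c_hat`, which is what makes completion at unobserved time points possible in this sketch.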
