Robust functional PCA for relative data
Jeremy Oguamalam, Peter Filzmoser, Karel Hron, Alessandra Menafoglio, Una Radojičić
TL;DR
This work addresses the challenge of robustly extracting principal modes of variation from relative functional data, such as density curves, within the Bayes space framework. It extends the Mahalanobis distance to Bayes spaces through regularized standardization, defining the regularized density Mahalanobis distance (RDMD) and establishing its connection to existing functional distances. Building on this, the authors develop Robust Density PCA (RDPCA), featuring trimmed Bayes covariance estimation and an iterative HD-based subset selection algorithm to obtain robust covariance operators and principal components for densities. The method is validated via simulations with contamination and two real-data applications (EPXMA spectra and fertility densities), demonstrating improved covariance estimation, more accurate PCs, and effective outlier detection, with discussion of extensions to sparse data and multivariate densities.
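The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' RDPCA implementation. It assumes densities sampled on a uniform grid, maps them to clr (centred log-ratio) coordinates, and then alternates between estimating a covariance on a subset and re-selecting the curves with the smallest regularized distances. The function names, the Tikhonov-style weights `1 / (lambda_j + alpha)`, the median-based starting subset, and the parameters `keep_fraction` and `alpha` are all hypothetical choices made for illustration.

```python
import numpy as np

def clr(densities):
    """Centred log-ratio transform of positive curves on a uniform grid (rows = curves)."""
    logs = np.log(densities)
    return logs - logs.mean(axis=1, keepdims=True)

def regularized_sq_distances(X, mean, cov, alpha):
    """Squared Mahalanobis-type distances with weights 1 / (lambda_j + alpha) (assumed form)."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals, 0.0, None)   # guard against tiny negative eigenvalues
    scores = (X - mean) @ eigvecs           # coordinates in the eigenbasis
    return (scores**2 / (eigvals + alpha)).sum(axis=1)

def trimmed_covariance(X, keep_fraction=0.75, alpha=1e-2, max_iter=50):
    """Iteratively re-select the h curves with the smallest regularized distances."""
    n = X.shape[0]
    h = int(np.ceil(keep_fraction * n))
    # Assumed initialisation: the h curves closest to the pointwise median curve.
    start = np.linalg.norm(X - np.median(X, axis=0), axis=1)
    subset = np.sort(np.argsort(start)[:h])
    for _ in range(max_iter):
        mean = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
        d2 = regularized_sq_distances(X, mean, cov, alpha)
        new_subset = np.sort(np.argsort(d2)[:h])
        if np.array_equal(new_subset, subset):
            break
        subset = new_subset
    # Recompute on the final subset so all returned quantities are consistent.
    mean = X[subset].mean(axis=0)
    cov = np.cov(X[subset], rowvar=False)
    d2 = regularized_sq_distances(X, mean, cov, alpha)
    return mean, cov, subset, d2

# Toy data: 45 Gaussian-shaped curves centred near 0.5 and 5 shifted outliers.
# The curves are left unnormalised because clr is invariant to rescaling.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 100)
mu = np.concatenate([rng.normal(0.5, 0.02, 45), np.full(5, 0.8)])
sigma = rng.uniform(0.14, 0.16, 50)
densities = np.exp(-((grid[None, :] - mu[:, None]) ** 2) / (2.0 * sigma[:, None] ** 2))

X = clr(densities)
mean, cov, subset, d2 = trimmed_covariance(X)

# Principal directions in clr space: eigenvectors of the trimmed covariance,
# ordered by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eigh(cov)
components = eigvecs[:, ::-1]
flagged = np.setdiff1d(np.arange(X.shape[0]), subset)  # curves left out of the subset
```

In this toy setting the shifted curves end up outside the retained subset and receive much larger distances than the retained curves. A practical version would also need a data-driven choice of `alpha`, a cutoff for flagging outliers, and quadrature weights for non-uniform grids, none of which are addressed here.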
Abstract
This paper introduces a robust approach to functional principal component analysis (FPCA) for relative data, particularly density functions. While recent papers have studied density data within the Bayes space framework, little attention has been paid to robust methods that can handle anomalous observations and heavy noise. To address this, we extend the concept of the Mahalanobis distance to Bayes spaces and propose a regularized version that accounts for the constraints inherent in density data. Based on this extension, we introduce a new method, robust density principal component analysis (RDPCA), which estimates functional principal components more accurately in the presence of outliers. The method's performance is validated through simulations and real-world applications, which show that it improves covariance estimation and principal component analysis compared to traditional methods.
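To make the idea of a regularized Mahalanobis distance for densities concrete, the following equations give the standard clr transformation and one possible Tikhonov-type regularization. The regularized form and the parameter \alpha are assumptions chosen for illustration and are not necessarily the definition used in the paper. Here (\lambda_j, e_j) denote the eigenvalues and eigenfunctions of the covariance operator of the clr-transformed data.

```latex
% clr transformation of a density f on I = [a, b]; the image integrates to zero
\operatorname{clr}(f)(t) = \ln f(t) - \frac{1}{b-a}\int_a^b \ln f(s)\,\mathrm{d}s,
\qquad
\int_a^b \operatorname{clr}(f)(t)\,\mathrm{d}t = 0.

% Unregularized Mahalanobis-type distance: the series may diverge because \lambda_j \to 0
d_M^2(f, g) = \sum_{j \ge 1}
  \frac{\langle \operatorname{clr}(f) - \operatorname{clr}(g),\, e_j \rangle_{L^2}^2}{\lambda_j}.

% Illustrative regularization (assumed form), with \alpha > 0
d_\alpha^2(f, g) = \sum_{j \ge 1}
  \frac{\langle \operatorname{clr}(f) - \operatorname{clr}(g),\, e_j \rangle_{L^2}^2}{\lambda_j + \alpha}
  \;\le\; \frac{1}{\alpha}\,\bigl\lVert \operatorname{clr}(f) - \operatorname{clr}(g) \bigr\rVert_{L^2}^2.
```

The bound on the right shows why adding \alpha keeps the distance finite for every pair of densities with square-integrable clr images, whereas the unregularized series need not converge.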
