Noisy Data Visualization using Functional Data Analysis
Haozhe Chen, Andres Felipe Duque Correa, Guy Wolf, Kevin R. Moon
TL;DR
This work tackles noisy data visualization by learning distances between latent dynamical parameters without constructing high-dimensional histograms. It introduces Functional Information Geometry (FIG), which uses functional data analysis to define a functional Mahalanobis distance between local densities and then embeds those distances with the diffusion-based method PHATE for visualization. The method learns functional principal component scores from neighbor densities and builds distances from them; embedding these distances reveals the intrinsic low-dimensional structure. Empirical results on simulated dynamics and EEG sleep data show that FIG outperforms the histogram-based EIG/DIG approaches, with greater robustness to hyperparameters and faster computation, which makes it practical for scalable visualization of noisy time series.
Abstract
Data visualization via dimensionality reduction is an important tool in exploratory data analysis. However, when the data are noisy, many existing methods fail to capture the underlying structure of the data. Empirical Intrinsic Geometry (EIG) was previously proposed for performing dimensionality reduction on high-dimensional dynamical processes while theoretically eliminating all noise. In practice, however, implementing EIG requires the construction of high-dimensional histograms, which suffer from the curse of dimensionality. Here we propose a new data visualization method called Functional Information Geometry (FIG) for dynamical processes that adapts the EIG framework while using approaches from functional data analysis to mitigate the curse of dimensionality. We experimentally demonstrate that the resulting method outperforms a variant of EIG designed for visualization in terms of capturing the true structure, hyperparameter robustness, and computational speed. We then use our method to visualize EEG brain measurements of sleep activity.
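
To make the idea of a Mahalanobis-type distance between local neighborhoods concrete, the following is a minimal sketch and not the authors' implementation. It assumes a one-dimensional time series, summarizes each time window by two hypothetical stand-in features (the window mean and standard deviation) where FIG would use functional principal component scores, and assumes a symmetrized form of the distance in which the quadratic forms under the two local inverse covariances are averaged. The function names, the window length `win`, the neighborhood half-width `k`, and the regularization `reg` are all illustrative choices.

```python
import numpy as np

def window_features(x, win):
    """Split a 1-D series into non-overlapping windows and return
    [mean, std] per window (stand-ins for functional PC scores)."""
    n = len(x) // win
    w = np.asarray(x[: n * win], dtype=float).reshape(n, win)
    return np.column_stack([w.mean(axis=1), w.std(axis=1)])

def local_inv_covs(feats, k, reg=1e-6):
    """Regularized inverse covariance of each window's temporal
    neighborhood (up to k windows on either side)."""
    n, d = feats.shape
    inv = np.empty((n, d, d))
    for i in range(n):
        lo, hi = max(0, i - k), min(n, i + k + 1)
        c = np.cov(feats[lo:hi], rowvar=False) + reg * np.eye(d)
        inv[i] = np.linalg.inv(c)
    return inv

def symmetrized_mahalanobis(feats, inv):
    """Pairwise distances with
    d^2(i, j) = 0.5 * (f_i - f_j)^T (C_i^{-1} + C_j^{-1}) (f_i - f_j)."""
    diff = feats[:, None, :] - feats[None, :, :]          # shape (n, n, d)
    qi = np.einsum("ijd,ide,ije->ij", diff, inv, diff)    # form under C_i^{-1}
    qj = np.einsum("ijd,jde,ije->ij", diff, inv, diff)    # form under C_j^{-1}
    return np.sqrt(np.maximum(0.5 * (qi + qj), 0.0))

# Toy example: a noisy sine wave cut into 40 windows of 50 samples each.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 2000)
x = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(t.size)
feats = window_features(x, win=50)
D = symmetrized_mahalanobis(feats, local_inv_covs(feats, k=3))
```

The resulting matrix `D` is a symmetric pairwise distance matrix over time windows; in a pipeline like the one summarized above, such a matrix would be handed to an embedding method that accepts precomputed distances.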
