Noisy Data Visualization using Functional Data Analysis

Haozhe Chen, Andres Felipe Duque Correa, Guy Wolf, Kevin R. Moon

TL;DR

This work tackles noisy data visualization by learning distances between latent dynamical parameters without density estimation. It introduces Functional Information Geometry (FIG), which uses functional data analysis to define a functional Mahalanobis distance between local densities and integrates diffusion-based embedding via PHATE for visualization. The method learns functional principal component scores from neighbor densities and constructs distances that are then embedded to reveal the intrinsic low-dimensional structure. Empirical results on simulated dynamics and EEG sleep data show FIG outperforms histogram-based EIG/DIG approaches, offering greater robustness to hyperparameters and faster computation, with practical impact for scalable visualization of noisy time-series.
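As a rough illustration of the distance step described above, the sketch below computes a symmetrized Mahalanobis distance between per-time-point score vectors, using local covariances estimated in a sliding window. This is the form commonly associated with the EIG line of work; whether FIG uses exactly this expression is an assumption here, and the function names, the `half_window` parameter, and the `ridge` regularizer are hypothetical choices made only for this example. The score vectors stand in for the functional principal component scores, which are not computed in this sketch.

```python
import numpy as np


def local_inverse_covariances(scores, half_window, ridge=1e-6):
    """Estimate an inverse covariance of the score vectors around each time index.

    scores: array of shape (n, d), one score vector per time point.
    half_window: number of neighbors taken on each side (hypothetical parameter).
    ridge: small diagonal regularizer so the covariance is invertible (assumption).
    """
    n, d = scores.shape
    inv_covs = np.empty((n, d, d))
    for t in range(n):
        lo, hi = max(0, t - half_window), min(n, t + half_window + 1)
        cov = np.atleast_2d(np.cov(scores[lo:hi], rowvar=False))
        inv_covs[t] = np.linalg.inv(cov + ridge * np.eye(d))
    return inv_covs


def symmetrized_mahalanobis(scores, inv_covs):
    """Pairwise distances d(i, j) with
    d(i, j)^2 = 0.5 * (s_i - s_j)^T (C_i^{-1} + C_j^{-1}) (s_i - s_j).
    """
    diff = scores[:, None, :] - scores[None, :, :]  # shape (n, n, d)
    # Quadratic form under the inverse covariance of point i, then of point j.
    q_i = np.einsum("ijk,ikl,ijl->ij", diff, inv_covs, diff)
    q_j = np.einsum("ijk,jkl,ijl->ij", diff, inv_covs, diff)
    # Clip tiny negative values caused by floating-point round-off.
    return np.sqrt(np.maximum(0.5 * (q_i + q_j), 0.0))
```

The resulting matrix could then be handed to an embedding method that accepts precomputed distances; the summary names PHATE for this step, but that call is left out here so as not to guess at its interface.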

Abstract

Data visualization via dimensionality reduction is an important tool in exploratory data analysis. However, when the data are noisy, many existing methods fail to capture the underlying structure of the data. The method called Empirical Intrinsic Geometry (EIG) was previously proposed for performing dimensionality reduction on high dimensional dynamical processes while theoretically eliminating all noise. However, implementing EIG in practice requires the construction of high-dimensional histograms, which suffer from the curse of dimensionality. Here we propose a new data visualization method called Functional Information Geometry (FIG) for dynamical processes that adapts the EIG framework while using approaches from functional data analysis to mitigate the curse of dimensionality. We experimentally demonstrate that the resulting method outperforms a variant of EIG designed for visualization in terms of capturing the true structure, hyperparameter robustness, and computational speed. We then use our method to visualize EEG brain measurements of sleep activity.

Paper Structure (10 sections, 25 equations, 6 figures, 1 table, 1 algorithm)

Figures (6)

  • Figure 1: Simulated data setup. (Left) Segment of the 3D movement of the object on the sphere. (Right) Corresponding segment of the 2D trajectory of the two independent factors: the horizontal and vertical angles.
  • Figure 2: Mantel coefficient between different embedding distances and the ground truth parameters $\boldsymbol{\theta}$ of the simulated random walk. FIG outperforms all methods in the high noise setting and is competitive in the low noise setting.
  • Figure 3: Comparison of FIG and DIG on EEG brain measurements during different sleep stages. FIG is more robust to different window sizes than DIG. (a) A visual comparison of FIG and DIG. Parts (b) and (c) show pairwise Mantel correlations between the embeddings.
  • Figure 4: 2D embeddings of the EEG data using other methods, colored by the same labels (sleep stages). These methods are unable to capture any of the structure in the data, in contrast with both FIG and DIG (Figure 3).
  • Figure 5: The standard deviations of Mantel test results for DIG and FIG on the EEG data computed from 5 runs. The corresponding average Mantel correlations are given in Figure 3. The standard deviations are generally low, especially for FIG, indicating our results are reproducible.
  • ...and 1 more figure
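
Several of the captions above report Mantel coefficients between distance matrices. As a minimal sketch of what such a statistic looks like, the code below correlates the off-diagonal entries of two symmetric distance matrices and estimates a permutation p-value. It assumes the Pearson variant of the Mantel statistic; the paper may use a different correlation or permutation scheme, and the function names and the `n_perm` and `seed` parameters are hypothetical.

```python
import numpy as np


def mantel_correlation(dist_a, dist_b):
    """Pearson correlation between the upper triangles of two distance matrices."""
    iu = np.triu_indices_from(dist_a, k=1)  # skip the diagonal and mirrored entries
    return np.corrcoef(dist_a[iu], dist_b[iu])[0, 1]


def mantel_test(dist_a, dist_b, n_perm=999, seed=0):
    """Return the observed Mantel correlation and a one-sided permutation p-value.

    Rows and columns of dist_a are permuted together, which relabels the points
    while keeping dist_a a valid distance matrix.
    """
    rng = np.random.default_rng(seed)
    observed = mantel_correlation(dist_a, dist_b)
    n = dist_a.shape[0]
    at_least_as_large = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        permuted = dist_a[np.ix_(perm, perm)]
        if mantel_correlation(permuted, dist_b) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value
```

In a setting like Figure 2, `dist_a` would be the pairwise distances in an embedding and `dist_b` the pairwise distances between the ground-truth parameters; a value near 1 indicates that the embedding preserves the parameter geometry.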