Shape-Informed Clustering of Multi-Dimensional Functional Data via Deep Functional Autoencoders

Samuel Singh, Shirley Coyle, Mimi Zhang

TL;DR

FAEclust tackles clustering of multi-dimensional functional data, including manifold-valued curves, by learning a shape-aware latent representation with a deep functional autoencoder. The framework couples a universal-approximator decoder with a convex, similarity-informed clustering objective and a path-following algorithm that builds a full clustering hierarchy in $O(n \log n)$, selecting the number of clusters via internal validation. The approach is reinforced with regularization on functional weights (orthogonality and roughness) and a penalized reconstruction objective to stabilize training. Empirical results across Euclidean and manifold-valued datasets, including time-warped scenarios, show state-of-the-art clustering performance and robustness to phase variation, highlighting practical impact for complex functional data analysis.
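
To make the two regularizers named above concrete, the following is a minimal sketch of a roughness penalty and an orthogonality penalty for functional weights sampled on a uniform grid. The exact penalties used by FAEclust are not given in this summary, so the forms below (integrated squared second derivative, and squared Frobenius distance between the $L^2$ Gram matrix and the identity) are common textbook choices assumed for illustration. The function names `roughness_penalty` and `orthogonality_penalty` are hypothetical.

```python
import numpy as np

def roughness_penalty(weights, grid):
    """Approximate sum_k of the integral of (w_k''(t))^2 for weights of shape (K, T).

    Assumed form; the paper's exact roughness penalty is not shown here.
    """
    dt = grid[1] - grid[0]                           # uniform grid assumed
    second = np.diff(weights, n=2, axis=1) / dt**2   # finite-difference w''
    return float(np.sum(second**2) * dt)

def orthogonality_penalty(weights, grid):
    """Squared Frobenius distance between the L2 Gram matrix and the identity.

    Assumed form; the paper's exact orthogonality penalty is not shown here.
    """
    dt = grid[1] - grid[0]
    gram = (weights @ weights.T) * dt                # <w_j, w_k> by Riemann sum
    return float(np.sum((gram - np.eye(weights.shape[0]))**2))

grid = np.linspace(0.0, 1.0, 201)

# Two smooth, nearly orthonormal Fourier functions.
smooth = np.stack([np.sqrt(2) * np.sin(2 * np.pi * grid),
                   np.sqrt(2) * np.cos(2 * np.pi * grid)])

# Two identical, rapidly oscillating functions: rough and far from orthogonal.
wiggly = np.stack([np.sqrt(2) * np.sin(20 * np.pi * grid),
                   np.sqrt(2) * np.sin(20 * np.pi * grid)])

print("roughness:    ", roughness_penalty(smooth, grid), roughness_penalty(wiggly, grid))
print("orthogonality:", orthogonality_penalty(smooth, grid), orthogonality_penalty(wiggly, grid))
```

Both penalties are small for the smooth pair and large for the duplicated, oscillating pair, which is the behaviour such regularizers are meant to discourage during training.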

Abstract

We introduce FAEclust, a novel functional autoencoder framework for cluster analysis of multi-dimensional functional data, that is, data that are random realizations of vector-valued random functions. Our framework features a universal-approximator encoder that captures complex nonlinear interdependencies among component functions, and a universal-approximator decoder capable of accurately reconstructing both Euclidean and manifold-valued functional data. Stability and robustness are enhanced through innovative regularization strategies applied to functional weights and biases. Additionally, we incorporate a clustering loss into the network's training objective, promoting the learning of latent representations that are conducive to effective clustering. A key innovation is our shape-informed clustering objective, which makes the clustering results resistant to phase variations in the functions. We establish the universal approximation property of our non-linear decoder and validate the effectiveness of our model through extensive experiments.
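
The phase variation mentioned above can be illustrated with a small numerical sketch. It does not implement the FAEclust objective; it only shows why a plain $L^2$ distance can separate two curves that share the same shape. The warping function $\gamma(t) = t^2$ and the helper `l2_distance` are assumptions chosen for illustration.

```python
import numpy as np

def l2_distance(f, g, grid):
    """Riemann-sum approximation of the L2 distance on a uniform grid."""
    dt = grid[1] - grid[0]
    return float(np.sqrt(np.sum((f - g) ** 2) * dt))

grid = np.linspace(0.0, 1.0, 501)
warp = grid ** 2                       # hypothetical warping gamma(t) = t^2

f = np.sin(2 * np.pi * grid)           # reference curve f(t)
g = np.sin(2 * np.pi * warp)           # same shape, different timing: g = f(gamma(t))

raw = l2_distance(f, g, grid)

# Undo the warp: g(gamma^{-1}(t)) = f(t), with gamma^{-1}(t) = sqrt(t).
g_aligned = np.interp(np.sqrt(grid), grid, g)
aligned = l2_distance(f, g_aligned, grid)

print(f"L2 distance before alignment: {raw:.3f}")
print(f"L2 distance after alignment:  {aligned:.5f}")
```

The distance is large before alignment and close to zero afterwards, even though both curves trace the same shape. A clustering objective that is sensitive only to shape would treat such a pair as similar.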

Paper Structure

This paper contains 34 sections, 1 theorem, 30 equations, 13 figures, 9 tables.

Key Result

Theorem 1

Let $\mathcal{F}=\{f: \mathbb{R}^s\mapsto \mathcal{H}(\mathcal{T}, \mathcal{M})\}$ denote a family of continuous mappings. When $\mathcal{M}$ is a $p$-dimensional Euclidean space, for any $f\in\mathcal{F}$ and $\epsilon\in(0, 1)$, there exists a functional network $\mathcal{D}: \mathbb{R}^s\mapsto \cdots$

Figures (13)

  • Figure 1: An illustrative FAE architecture with five hidden layers.
  • Figure 2: The joint network training and clustering framework.
  • Figure 3: A functional network $\mathcal{D}: \mathbb{R}^s\mapsto \mathcal{H}(\mathcal{T}, \mathbb{R})$ with only one hidden layer.
  • Figure 4: The order of the different layers: the fully connected layer is followed by batch normalization, then non-linear activation, and finally dropout.
  • Figure 5: Hypersphere functional data. Trajectories on the surface of a hypersphere $S^2$, with clusters defined by distinct great-circle paths and phase shifts.
  • ...and 8 more figures
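
The layer ordering described in the Figure 4 caption (fully connected, then batch normalization, then non-linear activation, then dropout) can be sketched as a single forward pass. This is a plain NumPy illustration on vector inputs, not the authors' functional layers. The ReLU activation, the dropout rate, the absence of learned batch-norm scale and shift, and the name `dense_block` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_block(x, weight, bias, drop_rate=0.2, training=True, eps=1e-5):
    """Fully connected -> batch normalization -> ReLU -> dropout (assumed details)."""
    z = x @ weight + bias                                      # fully connected layer
    z = (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + eps)    # batch norm, batch statistics only
    a = np.maximum(z, 0.0)                                     # non-linear activation (ReLU assumed)
    if training and drop_rate > 0.0:
        keep = rng.random(a.shape) >= drop_rate                # inverted dropout mask
        a = a * keep / (1.0 - drop_rate)                       # rescale so the expectation is unchanged
    return a

x = rng.normal(size=(32, 8))         # batch of 32 inputs with 8 features
weight = rng.normal(size=(8, 4))
bias = np.zeros(4)

out_train = dense_block(x, weight, bias, training=True)
out_eval = dense_block(x, weight, bias, training=False)
print(out_train.shape, out_eval.shape)
```

A full implementation would track running statistics for batch normalization at evaluation time; this sketch reuses batch statistics in both modes to stay short.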

Theorems & Definitions (2)

  • Theorem 1
  • proof