Neural Operators Can Discover Functional Clusters

Yicen Li, Jose Antonio Lara Benitez, Ruiyang Hong, Anastasis Kratsios, Paul David McNicholas, Maarten Valentijn de Hoop

TL;DR

It is proved that sample-based neural operators can learn any finite collection of classes in an infinite-dimensional reproducing kernel Hilbert space, even when the classes are neither convex nor connected, under mild kernel sampling assumptions.

Abstract

Operator learning is reshaping scientific computing by amortizing inference across infinite families of problems. While neural operators (NOs) are increasingly well understood for regression, far less is known for classification and its unsupervised analogue: clustering. We prove that sample-based neural operators can learn any finite collection of classes in an infinite-dimensional reproducing kernel Hilbert space, even when the classes are neither convex nor connected, under mild kernel sampling assumptions. Our universal clustering theorem shows that any $K$ closed classes can be approximated to arbitrary precision by NO-parameterized classes in the upper Kuratowski topology on closed sets, a notion that can be interpreted as disallowing false-positive misclassifications. Building on this, we develop an NO-powered clustering pipeline for functional data and apply it to unlabeled families of ordinary differential equation (ODE) trajectories. Discretized trajectories are lifted by a fixed pre-trained encoder into a continuous feature map and mapped to soft assignments by a lightweight trainable head. Experiments on diverse synthetic ODE benchmarks show that the resulting practical SNO recovers latent dynamical structure in regimes where classical methods fail, providing evidence consistent with our universal clustering theory.
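
The pipeline described above (a fixed encoder followed by a lightweight trainable head that outputs soft assignments) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration: the random ReLU feature map merely stands in for the paper's pre-trained encoder, the head is a single linear layer with a softmax, and the names `frozen_encoder` and `soft_assign` and all sizes are hypothetical. The training objective is not specified in this summary, so none is shown.

```python
import numpy as np

def frozen_encoder(trajs, W, b):
    """Stand-in for the fixed pre-trained encoder (assumption): random ReLU features."""
    return np.maximum(trajs @ W + b, 0.0)

def soft_assign(features, V, c):
    """Hypothetical trainable head: linear layer followed by a row-wise softmax."""
    logits = features @ V + c
    logits = logits - logits.max(axis=1, keepdims=True)  # subtract max for numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n, T, d, K = 8, 32, 16, 3  # trajectories, time steps, feature dim, clusters (all hypothetical)

trajs = rng.normal(size=(n, T))                            # placeholder discretized trajectories
W, b = rng.normal(size=(T, d)) / np.sqrt(T), np.zeros(d)   # encoder parameters, kept frozen
V, c = 0.1 * rng.normal(size=(d, K)), np.zeros(K)          # head parameters, the only trainable part

P = soft_assign(frozen_encoder(trajs, W, b), V, c)         # P[i, k]: soft assignment of trajectory i to cluster k
```

Each row of `P` is a probability vector over the `K` clusters; only `V` and `c` would be updated during training in this sketch.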

Paper Structure

This paper contains 37 sections, 10 theorems, 45 equations, 9 figures, 2 tables, 1 algorithm.

Key Result

Proposition 1

Let $\mathcal{K}$ be a non-empty subset of a Hilbert space $\mathcal{H}$, let $K\in \mathbb{N}_+$, and let $f_1,\dots,f_K\in \mathcal{K}$ be distinct. For every $k=1,\dots,K$, define the (cluster) set $C_k$ [defining equation missing from the source]. Then, for every $k=1,\dots,K$, the set $C_k$ is a non-empty closed subset of $\mathcal{K}$.

Figures (9)

  • Figure 1: (Left: K-Means) Standard finite-dimensional $K$-means relies on mapping data to discrete cluster centers $\mu_1,\mu_2$ in a reduced feature space, often discarding continuous dynamics. (Right: Neural Operator) In contrast, our proposed method leverages a single neural operator to generate continuous cluster indicator functions. The blue and green curves represent the two output components $\hat{f}_1$ and $\hat{f}_2$ of the neural operator (one for each output dimension), and the corresponding cluster regions are defined via thresholding at $\gamma$.
  • Figure 2: Geometric intuition of set-valued clustering via upper Kuratowski convergence. Unlike point-wise estimation, the learned cluster support $P_{\theta}$ (green) approximates the ground truth $P_{\text{data}}$ (black) strictly from the interior. Whether in the intermediate phase (a) or near convergence (b), the approximation is constrained to stay within the target boundaries, thereby maximizing cluster purity. This ensures no false positives, meaning the model prevents infeasible behaviors lying outside the target set.
  • Figure 3: Architecture of the Sampling-Based Neural Operator (SNO). (a) The discretization process samples the function $f \in \mathcal{H}$ via inner products with reproducing kernels $\kappa(\cdot, x_s)$ centered at points $x_s$ to form the input vector $v_0 \in \mathbb{R}^S$. (b) The deep neural network maps the discretized input $v_0$ through multiple layers of affine maps $A_l$ and nonlinearities (ReLU) to produce output logits, defining the clusters $C_k$. Note that only the first layer is a non-local kernel-based operator acting on the function space; the remaining layers form a lightweight standard deep learning model, which in our analysis is an MLP.
  • Figure 4: Visual comparison of trajectory structures across all systems in ODE-6. Each plot overlays 5 randomly selected trajectories. Due to path overlaps, fewer than 5 distinct lines may be visible in some panels. Rows represent different dynamical systems, and columns illustrate the effect of increasing resolution $S$ from 4 (left) to 224 (right).
  • Figure 5: We evaluate the SNO model with 5 random seeds across varying sampling resolutions, ranging from a coarse $2 \times 2$ to $224 \times 224$ on the ODE-6 test set. The improvement of clustering metrics empirically demonstrates that the sequence of discretized operators converges toward a stable limit as the sampling density increases in this structured regime. This corroborates our claim that SNO approximates the true cluster partition.
  • ...and 4 more figures
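
The architecture in Figure 3 and the thresholding in Figure 1 can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation. It relies on the reproducing property $\langle f, \kappa(\cdot, x_s)\rangle = f(x_s)$, so the kernel inner products reduce to point evaluations. The function names, layer widths, random weights, test function, and the choice of "logit at least $\gamma$" as the membership rule are all hypothetical.

```python
import numpy as np

def sample_function(f, xs):
    """First layer: v_0[s] = <f, kappa(., x_s)> = f(x_s) by the reproducing property."""
    return np.array([f(x) for x in xs], dtype=float)

def mlp_logits(v0, weights, biases):
    """Remaining layers: affine maps A_l with ReLU in between, no activation on the output."""
    h = v0
    for A, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(A @ h + b, 0.0)
    return weights[-1] @ h + biases[-1]

def cluster_membership(logits, gamma):
    """Assumed rule: f is placed in cluster k when its k-th output is at least gamma."""
    return logits >= gamma

rng = np.random.default_rng(0)
S, hidden, K = 8, 16, 2                      # sample points, hidden width, clusters (hypothetical)
xs = np.linspace(0.0, 1.0, S)                # sampling locations x_s
f = lambda x: np.sin(2.0 * np.pi * x)        # example input function

weights = [rng.normal(size=(hidden, S)), rng.normal(size=(K, hidden))]
biases = [np.zeros(hidden), np.zeros(K)]

v0 = sample_function(f, xs)                  # discretized input in R^S
logits = mlp_logits(v0, weights, biases)     # one output component per cluster
members = cluster_membership(logits, gamma=0.0)
```

With untrained random weights the memberships are meaningless; the sketch only shows how a function is turned into $v_0$, passed through the MLP, and thresholded at $\gamma$.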

Theorems & Definitions (20)

  • Proposition 1: Representation of Pure Clusters
  • Definition 1: Sampling-Based Neural Operators
  • Theorem 1: Universal Clustering
  • proof
  • proof: Proof of Proposition \ref{prop:classifier_representation__prelimVersion}
  • Proposition 2: Representation of Pure Clusters
  • Lemma 1
  • proof: Proof of Lemma \ref{lem:classification_open__existence}
  • Lemma 2
  • proof: Proof of Lemma \ref{lem:Kclusters_are_open}
  • ...and 10 more