HiPoNet: A Multi-View Simplicial Complex Network for High Dimensional Point-Cloud and Single-Cell Data

Siddharth Viswanath, Hiren Madhu, Dhananjay Bhaskar, Jake Kovalic, David R Johnson, Christopher Tape, Ian Adelstein, Rex Ying, Michael Perlmutter, Smita Krishnaswamy

TL;DR

HiPoNet tackles the challenge of learning from high-dimensional point clouds common in single-cell and spatial transcriptomics. It integrates multi-view feature weighting, higher-order simplicial complex modeling, and multiscale simplicial wavelet scattering to produce representations that preserve geometry and topology. Theoretical results establish diffusion-based properties on simplicial complexes and their relation to geodesic distances, while empirical results show state-of-the-art performance on diverse biological datasets and provide interpretability through view-specific marker importance. Overall, the approach offers a scalable, end-to-end differentiable solution for high-dimensional data analysis with meaningful biological insights.

Abstract

In this paper, we propose HiPoNet, an end-to-end differentiable neural network for regression, classification, and representation learning on high-dimensional point clouds. Our work is motivated by single-cell data, which can have very high dimensionality, exceeding the capabilities of existing point-cloud methods, which are mostly tailored to 3D data. Moreover, modern single-cell and spatial experiments now yield entire cohorts of datasets (i.e., one dataset for every patient), necessitating models that can process large, high-dimensional point clouds at scale. Most current approaches build a single nearest-neighbor graph, discarding important geometric and topological information. In contrast, HiPoNet models the point cloud as a set of higher-order simplicial complexes, with each complex created using a reweighting of features. This method thus generates multiple constructs corresponding to different views of high-dimensional data, which in biology offers the possibility of disentangling distinct cellular processes. It then employs simplicial wavelet transforms to extract multiscale features, capturing both local and global topology from each view. We show that geometric and topological information is preserved in this framework both theoretically and empirically. We showcase the utility of HiPoNet on point-cloud-level tasks, involving classification and regression of entire point clouds in data cohorts. Experimentally, we find that HiPoNet outperforms other point-cloud and graph-based models on single-cell data. We also apply HiPoNet to spatial transcriptomics datasets using spatial coordinates as one of the views. Overall, HiPoNet offers a robust and scalable solution for high-dimensional data analysis.
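
The multi-view idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation, and it simplifies heavily: the per-view feature weights are fixed arrays rather than learned parameters, each view is a plain k-nearest-neighbor graph (vertices and edges only, no higher-order simplices), and the wavelets are ordinary graph diffusion wavelets built from a lazy random walk. The function names `knn_adjacency`, `view_features`, and `embed_point_cloud` and the parameters `k` and `J` are hypothetical, chosen only for this example.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric 0/1 k-nearest-neighbor adjacency from Euclidean distances."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)            # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]     # k closest points for each row
    A = np.zeros((n, n))
    A[np.arange(n)[:, None], idx] = 1.0
    return np.maximum(A, A.T)               # symmetrize

def view_features(X, w, k=5, J=3):
    """Multiscale summary of one view: reweight features, build a graph,
    apply diffusion wavelets, and pool absolute responses over points."""
    Xw = X * w                              # one "view" = one feature reweighting
    A = knn_adjacency(Xw, k)
    n = A.shape[0]
    # Lazy random walk, column-normalized; every node has degree >= k > 0.
    P = 0.5 * (np.eye(n) + A / A.sum(0, keepdims=True))
    feats, prev = [], P                     # prev holds P^(2^j)
    for _ in range(J):
        nxt = prev @ prev                   # P^(2^(j+1))
        psi = prev - nxt                    # band-pass wavelet at this scale
        feats.append(np.abs(psi @ Xw).sum(0))
        prev = nxt
    return np.concatenate(feats)            # length J * n_features

def embed_point_cloud(X, weights, k=5, J=3):
    """Concatenate the per-view summaries into one vector for the point cloud."""
    return np.concatenate([view_features(X, w, k, J) for w in weights])

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))                # 40 points (cells), 6 features
weights = rng.uniform(size=(3, 6))          # 3 views; fixed here, learned in a real model
z = embed_point_cloud(X, weights)
```

Because the responses are summed over points, the resulting vector does not depend on the order of the points, which is what a point-cloud-level classifier or regressor needs as input.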

Paper Structure

This paper contains 21 sections, 7 theorems, 19 equations, 1 figure, 12 tables, 1 algorithm.

Key Result

Theorem 4.1

The heat equations on the simplicial complex respect its $0$-homology structure.
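
A small numerical illustration may help here. It is a sketch under assumptions, not the paper's construction: it uses only the ordinary graph Laplacian on vertices (the lowest-dimensional case), and it reads "respecting $0$-homology" as "heat never crosses between connected components", which is our interpretation rather than the theorem's exact statement. The helper `heat_solve` is a hypothetical name.

```python
import numpy as np

def heat_solve(L, u0, t):
    """Solve du/dt = -L u with u(0) = u0 via the eigendecomposition of L."""
    lam, U = np.linalg.eigh(L)
    return U @ (np.exp(-t * lam) * (U.T @ u0))

# Two disjoint triangles: vertices {0, 1, 2} and {3, 4, 5}.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(1)) - A                   # combinatorial graph Laplacian

u0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])   # unit of heat on vertex 0
u = heat_solve(L, u0, t=50.0)

# The number of (numerically) zero eigenvalues equals the number of components.
n_components = int(np.sum(np.linalg.eigvalsh(L) < 1e-9))
```

After a long diffusion time the heat is spread evenly over the first triangle and none reaches the second, while the zero eigenvalues of the Laplacian count the two connected components.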

Figures (1)

  • Figure 1: The HiPoNet pipeline.

Theorems & Definitions (15)

  • Theorem 4.1
  • Proof (sketch)
  • Definition 4.2: Simplicial Graph
  • Theorem 4.3
  • Proof (sketch)
  • Theorem 4.4
  • Proof (sketch)
  • Corollary 4.5
  • Theorem C.1: Connected Components on Simplicial Complexes
  • Proof
  • ...and 5 more