On Learning Deep O($n$)-Equivariant Hyperspheres

Pavlo Melnyk; Michael Felsberg; Mårten Wadenbäck; Andreas Robinson; Cuong Le

TL;DR

This work addresses learning deep features that are equivariant to orthogonal transformations in arbitrary dimensions by introducing Deep Equivariant Hyperspheres (DEH), which combine regular $n$-simplexes with $n$D spherical decision surfaces. The authors derive a simplex change-of-basis matrix $\textbf{M}_n$, construct $n$D equivariant spheres, and cascade them to build deep, point-based representations; they also propose an invariant Gram-based operator $\boldsymbol{\triangle} = \textbf{Y}\textbf{Y}^\top$ to capture higher-order relations. Theoretical results establish the $\text{O}(n)$-equivariance of the neuron, alongside practical techniques for normalization, bias, and higher-order interactions. Experiments on $\text{O}(3)$ and $\text{O}(5)$ tasks show DEH outperforming several baselines while offering a favorable speed/performance trade-off. The approach generalizes to any dimension $n$, enabling geometry-aware learning on 3D/4D data with potential applications in molecular design and related domains; code is released at the provided repository.

Abstract

In this paper, we utilize hyperspheres and regular $n$-simplexes and propose an approach to learning deep features equivariant under the transformations of $n$D reflections and rotations, encompassed by the powerful group of O$(n)$. Namely, we propose O$(n)$-equivariant neurons with spherical decision surfaces that generalize to any dimension $n$, which we call Deep Equivariant Hyperspheres. We demonstrate how to combine them in a network that directly operates on the basis of the input points and propose an invariant operator based on the relation between two points and a sphere, which, as we show, turns out to be a Gram matrix. Using synthetic and real-world data in $n$D, we experimentally verify our theoretical contributions and find that our approach is superior to the competing methods for O$(n)$-equivariant benchmark datasets (classification and regression), demonstrating a favorable speed/performance trade-off. The code is available at https://github.com/pavlo-melnyk/equivariant-hyperspheres.
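The claim that a Gram matrix is invariant under O$(n)$ can be checked numerically. The following is a minimal sketch, not the authors' implementation: the array `Y` is a random placeholder for the equivariant features (one row per point), and the helper names `random_orthogonal` and `gram` are hypothetical. It relies only on the identity $(\textbf{Y}\textbf{R}^\top)(\textbf{Y}\textbf{R}^\top)^\top = \textbf{Y}\textbf{R}^\top\textbf{R}\textbf{Y}^\top = \textbf{Y}\textbf{Y}^\top$ for any orthogonal $\textbf{R}$.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(n, rng):
    """Sample an n x n orthogonal matrix via QR of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q  # q @ q.T is the identity; det(q) may be +1 or -1

def gram(y):
    """Gram matrix of the rows of y (shape N x n): all pairwise inner products."""
    return y @ y.T

n, num_points = 5, 8
Y = rng.standard_normal((num_points, n))  # placeholder features (assumption)
R = random_orthogonal(n, rng)             # a rotation or reflection in nD

G = gram(Y)
G_transformed = gram(Y @ R.T)             # every row y_i is mapped to R y_i

print(np.allclose(G, G_transformed))
```

Because the check uses nothing beyond $\textbf{R}^\top\textbf{R} = \textbf{I}$, it holds for reflections as well as rotations.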

Paper Structure (35 sections, 9 theorems, 28 equations, 4 figures, 4 tables)

This paper contains 35 sections, 9 theorems, 28 equations, 4 figures, 4 tables.

Introduction
Related work
Background
Spherical neurons via non-linear embedding
Equi- and invariance under $\text{O}(n)$-transformations
Steerable 3D spherical neurons and TetraSphere
TetraSphere
Regular simplexes
Deep Equivariant Hyperspheres
The simplex change of basis
Equivariant nD spheres
Normalization and additional non-linearity
Extracting deep equivariant features
Modelling higher-order interactions
Experimental validation
...and 20 more sections

Key Result

Proposition 1

Let $\textup{M}_n$ be the change-of-basis matrix defined in eq:nd_basis_matrix. Then $\textup{M}_n$ is an $(n+1)$D rotation or reflection, i.e., $\textup{M}_n \in \text{O}(n+1)$ (see Section sec:A_numeric_instances in the Appendix for numeric examples).
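Proposition 1 can be checked numerically under an assumed form of the matrix. The defining equation is not reproduced on this page, so the sketch below assumes that $\textup{M}_n$ stacks the scaled vertex matrix $p\,\textup{P}_n$ of a regular $n$-simplex (unit-norm vertices, centred at the origin, stored as columns) on top of the constant row $(n+1)^{-1/2}\,\mathbf{1}^\top$, with $p = \sqrt{n/(n+1)}$. The function names `regular_simplex` and `simplex_change_of_basis` are hypothetical, and the particular simplex orientation produced here need not match the one used in the paper.

```python
import numpy as np

def regular_simplex(n):
    """Return an n x (n+1) matrix whose columns are unit-norm vertices of a
    regular n-simplex centred at the origin."""
    centering = np.eye(n + 1) - np.full((n + 1, n + 1), 1.0 / (n + 1))
    # The centering matrix is a rank-n projection; its first n left singular
    # vectors form an orthonormal basis of the hyperplane orthogonal to 1.
    u, _, _ = np.linalg.svd(centering)
    vertices = u[:, :n].T  # shape (n, n+1), one vertex per column
    return vertices / np.linalg.norm(vertices, axis=0, keepdims=True)

def simplex_change_of_basis(n):
    """Assumed form of M_n: scaled simplex vertices stacked on a constant row."""
    p = np.sqrt(n / (n + 1))
    ones_row = np.full((1, n + 1), 1.0 / np.sqrt(n + 1))
    return np.vstack([p * regular_simplex(n), ones_row])

for n in (2, 3, 5):
    M = simplex_change_of_basis(n)
    is_orthogonal = np.allclose(M @ M.T, np.eye(n + 1))
    # The determinant is +1 for a rotation and -1 for a reflection.
    print(n, is_orthogonal, round(float(np.linalg.det(M)), 6))
```

Orthogonality follows from two properties of the simplex: the vertices sum to zero, so the constant row is orthogonal to every row of $\textup{P}_n$, and $\textup{P}_n\textup{P}_n^\top = \tfrac{n+1}{n}\,\textup{I}_n$, which the factor $p$ rescales to the identity.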

Figures (4)

Figure 1: The central components of Deep Equivariant Hyperspheres (best viewed in color): regular $n$-simplexes with the $n$D spherical decision surfaces located at their vertices and the simplex change-of-basis matrices $\textbf{M}_n$ (displayed for $n=2$ and $n=3$).
Figure 2: Left: real data experiment (the higher the accuracy the better); all the presented models are also permutation-invariant. Center and right: synthetic data experiments (the lower the mean squared error (MSE) the better); dotted lines mean that the results of the methods are copied from finzi2021practical ($\text{O}(5)$ regression) or ruhe2023clifford ($\text{O}(5)$ convex hulls). Best viewed in color.
Figure 3: Speed/performance trade-off (the models are trained on all the available training data). Note that the desired trade-off is toward the top-left corner (higher accuracy and faster inference) in the left figure, and toward the bottom-left corner (lower error and faster inference) in the center and right figures. To measure inference time, we used an NVIDIA A100. Best viewed in color.
Figure 4: Architecture of our DEH model. All the operations are point-wise, i.e., shared among the $N$ points. Each subsequent layer of equivariant hyperspheres contains $K_l$ neurons for each of the $\prod_i^{d} K_i$ preceding layer channels. The architectures of the non-permutation-invariant variants differ only in that the global aggregation function over $N$ is substituted with the flattening of the feature map.
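The difference between the two variants described in the Figure 4 caption can be illustrated with a small sketch. It assumes mean pooling as the global aggregation function (the caption does not name one) and uses a random placeholder array `features` for the point-wise feature map; neither is taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C = 6, 4
features = rng.standard_normal((N, C))  # placeholder point-wise features (assumption)
perm = np.roll(np.arange(N), 1)         # a fixed non-trivial reordering of the points

# Global aggregation over the N points (mean pooling assumed here):
pooled = features.mean(axis=0)
pooled_perm = features[perm].mean(axis=0)

# Flattening the feature map keeps the point order in the output:
flat = features.reshape(-1)
flat_perm = features[perm].reshape(-1)

print(np.allclose(pooled, pooled_perm))   # aggregation ignores the point order
print(np.array_equal(flat, flat_perm))    # flattening does not
```

Any symmetric reduction over the point axis (sum, max, mean) gives the same permutation invariance; flattening trades it for a representation that depends on the ordering of the input points.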

Theorems & Definitions (17)

Proposition 1
proof
Lemma 2
proof
Proposition 3
proof
Theorem 4
proof
Proposition 5
proof
...and 7 more
