Fully Hyperbolic Convolutional Neural Networks for Computer Vision

Ahmad Bdeir, Kristian Schwethelm, Niels Landwehr

TL;DR

This paper addresses a key limitation of existing hyperbolic neural networks for vision, namely their reliance on Euclidean backbones, by proposing HCNN, a fully hyperbolic CNN built on the Lorentz model that learns hyperbolic representations in every layer. It introduces Lorentz-specific components (Lorentz convolution, Lorentz batch normalization, and Lorentz multinomial logistic regression) along with Lorentz-compatible residual connections and activations, enabling end-to-end hyperbolic encoders. Empirical results on image classification and generation show that HCNNs, especially with the Lorentz model, achieve higher accuracy, improved robustness, and effective low-dimensional embeddings compared to Euclidean and Poincaré baselines. The work demonstrates the practical viability and stability advantages of fully hyperbolic vision models and provides a foundation for future scalable hyperbolic architectures in computer vision. The code is publicly available.

Abstract

Real-world visual data exhibit intrinsic hierarchical structures that can be represented effectively in hyperbolic spaces. Hyperbolic neural networks (HNNs) are a promising approach for learning feature representations in such spaces. However, current HNNs in computer vision rely on Euclidean backbones and only project features to the hyperbolic space in the task heads, limiting their ability to fully leverage the benefits of hyperbolic geometry. To address this, we present HCNN, a fully hyperbolic convolutional neural network (CNN) designed for computer vision tasks. Based on the Lorentz model, we generalize fundamental components of CNNs and propose novel formulations of the convolutional layer, batch normalization, and multinomial logistic regression. Experiments on standard vision tasks demonstrate the promising performance of our HCNN framework in both hybrid and fully hyperbolic settings. Overall, we believe our contributions provide a foundation for developing more powerful HNNs that can better represent complex structures found in image data. Our code is publicly available at https://github.com/kschwethelm/HyperbolicCV.
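As a rough illustration of the Lorentz model that the abstract builds on, the NumPy sketch below lifts Euclidean vectors onto the hyperboloid $\mathbb{L}^n_K$ and computes the geodesic distance between two points. The function names, the time-component-first layout, and the default curvature $K=-1$ are assumptions made for this example; they are not taken from the authors' repository.

```python
import numpy as np

def lorentz_inner(x, y):
    """Lorentzian inner product -x_t*y_t + <x_s, y_s> (time component first)."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)

def lift_to_hyperboloid(x_s, K=-1.0):
    """Prepend the time component so that <x, x>_L = 1/K, with K < 0."""
    x_t = np.sqrt(np.sum(x_s**2, axis=-1, keepdims=True) - 1.0 / K)
    return np.concatenate([x_t, x_s], axis=-1)

def lorentz_distance(x, y, K=-1.0):
    """Geodesic distance arccosh(K * <x, y>_L) / sqrt(-K)."""
    # Clip guards against arguments that fall slightly below 1 due to rounding.
    arg = np.clip(K * lorentz_inner(x, y), 1.0, None)
    return np.arccosh(arg) / np.sqrt(-K)

origin = lift_to_hyperboloid(np.zeros(2))
point = lift_to_hyperboloid(np.array([0.3, -0.4]))
print(lorentz_inner(point, point))      # close to 1/K = -1
print(lorentz_distance(origin, point))  # distance from the hyperboloid origin
```

For $K=-1$, the distance from the origin reduces to $\sinh^{-1}(\lVert\bm{x}_s\rVert)$, which gives a quick sanity check on the lifted points.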

Paper Structure (71 sections, 2 theorems, 53 equations, 7 figures, 12 tables)

Key Result

Theorem 1

Given $a\in\mathbb{R}$ and $\bm{z}\in\mathbb{R}^n$, the minimum hyperbolic distance from a point $\bm{x}\in\mathbb{L}^n_K$ to the hyperplane $\tilde{H}_{\bm{z},a}$ defined in Eq. eq:hyppp is given by

Figures (7)

  • Figure 1: In contrast to hybrid HNNs that use a Euclidean CNN for feature extraction, our HCNN learns features in hyperbolic spaces in every layer, fully leveraging the benefits of hyperbolic geometry. This leads to better image representations and performance.
  • Figure 2: Comparison of the Lorentz and Poincaré models.
  • Figure 3: CIFAR-100 accuracy obtained with lower dimensionalities in the final ResNet block.
  • Figure 4: Embeddings of the MNIST dataset in the 2D latent space of VAEs (with generation FID). Colors represent ground-truth labels, and Lorentz embeddings are projected onto the Poincaré ball for better visualization.
  • Figure 5: Illustrations of geometrical operations in the 2-dimensional Lorentz model. (a) The shortest distance between two points is represented by the connecting geodesic (red line). (b) The red line gets projected onto the tangent space of the origin resulting in the green line. (c) The green line gets parallel transported to the tangent space of the origin.
  • ...and 2 more figures
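
The Figure 4 caption mentions projecting Lorentz embeddings onto the Poincaré ball for visualization. One standard way to do this is the stereographic map $\bm{x}_s / (1 + \sqrt{-K}\,x_t)$, sketched below. The function name, the time-component-first layout, and the sample latent codes are hypothetical, and the sketch is not claimed to match the authors' plotting code.

```python
import numpy as np

def lorentz_to_poincare(x, K=-1.0):
    """Map hyperboloid points (time component first) to the Poincaré ball.

    Assumes <x, x>_L = 1/K with K < 0; the image lies inside the ball of
    radius 1/sqrt(-K).
    """
    x_t, x_s = x[..., :1], x[..., 1:]
    return x_s / (1.0 + np.sqrt(-K) * x_t)

# Hypothetical 2D latent codes lifted onto the hyperboloid with K = -1.
latents = np.array([[0.0, 0.0], [3.0, 4.0], [-50.0, 20.0]])
x_t = np.sqrt(np.sum(latents**2, axis=-1, keepdims=True) + 1.0)
ball = lorentz_to_poincare(np.concatenate([x_t, latents], axis=-1))
norms = np.linalg.norm(ball, axis=-1)
print(norms)  # every norm stays strictly below 1
```

Points far from the hyperboloid origin land close to the boundary of the ball, which is why such plots tend to crowd toward the rim.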

Theorems & Definitions (2)

  • Theorem 1
  • Theorem 2