Hyperbolic Busemann Neural Networks

Ziheng Chen, Bernhard Schölkopf, Nicu Sebe

TL;DR

This work lifts two core components of neural networks, Multinomial Logistic Regression and Fully Connected layers, into hyperbolic space via Busemann functions, resulting in Busemann MLR (BMLR) and Busemann FC (BFC) layers with a unified mathematical interpretation.

Abstract

Hyperbolic spaces provide a natural geometry for representing hierarchical and tree-structured data due to their exponential volume growth. To leverage these benefits, neural networks require intrinsic and efficient components that operate directly in hyperbolic space. In this work, we lift two core components of neural networks, Multinomial Logistic Regression (MLR) and Fully Connected (FC) layers, into hyperbolic space via Busemann functions, resulting in Busemann MLR (BMLR) and Busemann FC (BFC) layers with a unified mathematical interpretation. BMLR provides compact parameters, a point-to-horosphere distance interpretation, batch-efficient computation, and a Euclidean limit, while BFC generalizes FC and activation layers with comparable complexity. Experiments on image classification, genome sequence learning, node classification, and link prediction demonstrate improvements in effectiveness and efficiency over prior hyperbolic layers. The code is available at https://github.com/GitZH-Chen/HBNN.

Paper Structure

This paper contains 51 sections, 9 theorems, 94 equations, 2 figures, and 20 tables.

Key Result

Theorem 3.1

As $K\to 0^{-}$, the hyperbolic Busemann functions converge to the Euclidean inner product, and the hyperbolic BMLRs accordingly converge to the Euclidean MLR.
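
The flat limit in Theorem 3.1 can be checked numerically. The sketch below is not taken from the paper's code: it assumes the Poincaré ball model of curvature $K=-c$ and the standard closed form of the Busemann function for a unit direction $v$, $B^{v}(x)=\frac{1}{\sqrt{c}}\log\frac{\|v-\sqrt{c}\,x\|^{2}}{1-c\|x\|^{2}}$. The function name `busemann_poincare` and the sample points are hypothetical. Under this convention the limit is $-2\langle x,v\rangle$, that is, the Euclidean inner product up to a sign and a constant factor; the paper's normalization may absorb that factor differently.

```python
import math


def busemann_poincare(x, v, c=1.0):
    """Busemann function on the Poincare ball of curvature K = -c, c > 0.

    x: point inside the ball, so c * ||x||^2 < 1.
    v: unit vector naming the ideal (boundary) direction.
    Assumed closed form (not copied from the paper):
        (1 / sqrt(c)) * log(||v - sqrt(c) x||^2 / (1 - c ||x||^2))
    """
    sc = math.sqrt(c)
    num = sum((vi - sc * xi) ** 2 for vi, xi in zip(v, x))
    den = 1.0 - c * sum(xi * xi for xi in x)
    return math.log(num / den) / sc


def flat_limit(x, v):
    """Value approached as c -> 0 under the convention above: -2 <x, v>."""
    return -2.0 * sum(xi * vi for xi, vi in zip(x, v))


if __name__ == "__main__":
    x = [0.3, -0.2]   # hypothetical sample point
    v = [0.6, 0.8]    # unit direction, 0.36 + 0.64 = 1
    for c in (1.0, 1e-2, 1e-4, 1e-8):
        # The gap to the flat limit shrinks as the curvature goes to zero.
        print(c, busemann_poincare(x, v, c), flat_limit(x, v))
```

Level sets of `busemann_poincare` are the horospheres based at `v` (the red curves in Figure 1), so along the geodesic ray from the origin toward `v` the value decreases by exactly the hyperbolic distance travelled.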

Figures (2)

  • Figure 1: Illustration: red curves are different horospheres of $B^{v}$.
  • Figure 2: Validation accuracy curves on ImageNet-1k.

Theorems & Definitions (37)

  • Theorem 3.1: Limits of BMLRs
  • proof
  • Remark 1: Intuition
  • Theorem 3.2: Hadamard horosphere distance
  • proof
  • Corollary 1: Point-to-horosphere distance
  • Remark 2: Generality
  • Theorem 4.1
  • proof
  • Theorem 4.2
  • ...and 27 more