Robust Hyperbolic Learning with Curvature-Aware Optimization

Ahmad Bdeir, Johannes Burchert, Lars Schmidt-Thieme, Niels Landwehr

TL;DR

The work tackles instability, overfitting, and high computational cost in hyperbolic deep learning by formulating a curvature-aware optimization framework on the Lorentz manifold. It introduces a Riemannian AdamW optimizer, a curvature-aware update mechanism with tangent-space projections and parallel transport, and a flexible maximum-distance rescaling strategy to keep hyperbolic embeddings within the representational radius. A tanh-based scaling function and CUDA-friendly hyperbolic convolutions further enhance stability and efficiency, enabling low-precision training. Empirically, the approach yields state-of-the-art or competitive results across hierarchical metric learning, EEG classification, and image tasks, while delivering substantial runtime and memory savings, highlighting the practical viability of curvature-adaptive hyperbolic learning. All formulations are grounded in the Lorentz model, with key quantities such as the distance $d_{\boldsymbol{L}}$, the exponential and logarithmic maps, and the curvature parameter $K$ carefully maintained during optimization.
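To make the Lorentz-model quantities named above concrete, the following is a minimal sketch of the standard textbook formulas, not the paper's implementation. It assumes the hyperboloid convention $\langle x, x \rangle_{\boldsymbol{L}} = 1/K$ with $K < 0$ and restricts the exponential and logarithmic maps to the origin; the function names are hypothetical.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x_0*y_0 + sum_i x_i*y_i."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lorentz_distance(x, y, K=-1.0):
    """Geodesic distance on the hyperboloid <x, x>_L = 1/K (assumed convention, K < 0)."""
    # Clamp to 1 so rounding error cannot leave the domain of arccosh.
    arg = max(K * lorentz_inner(x, y), 1.0)
    return math.acosh(arg) / math.sqrt(-K)

def expmap0(v_space, K=-1.0):
    """Map a tangent vector at the origin (space components only) onto the manifold."""
    c = math.sqrt(-K)
    norm = math.sqrt(sum(a * a for a in v_space))
    if norm == 0.0:
        return [1.0 / c] + [0.0] * len(v_space)  # the origin itself
    scale = math.sinh(c * norm) / (c * norm)
    return [math.cosh(c * norm) / c] + [scale * a for a in v_space]

def logmap0(x, K=-1.0):
    """Inverse of expmap0: return the space components of the tangent vector."""
    c = math.sqrt(-K)
    norm = math.sqrt(sum(a * a for a in x[1:]))
    if norm == 0.0:
        return [0.0] * (len(x) - 1)
    dist = math.acosh(max(c * x[0], 1.0)) / c  # distance from the origin
    return [dist * a / norm for a in x[1:]]
```

Under these assumptions, `expmap0` followed by `logmap0` recovers the original tangent vector, and the distance from the origin to `expmap0(v, K)` equals the Euclidean norm of `v` for any valid $K$.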

Abstract

Hyperbolic deep learning has become a growing research direction in computer vision due to the unique properties afforded by the alternate embedding space. The negative curvature and exponentially growing distance metric provide a natural framework for capturing hierarchical relationships between datapoints and allow for finer separability between their embeddings. However, current hyperbolic learning approaches still overfit easily, are computationally expensive, and become unstable, especially when attempting to learn the manifold curvature to adapt to different tasks and datasets. To address these issues, our paper presents a derivation for Riemannian AdamW that helps increase hyperbolic generalization ability. For improved stability, we introduce a novel fine-tunable hyperbolic scaling approach to constrain hyperbolic embeddings and reduce approximation errors. Using this along with our curvature-aware learning schema for Lorentzian optimizers enables the combination of curvature and non-trivialized hyperbolic parameter learning. Our approach demonstrates consistent performance improvements across computer vision, EEG classification, and hierarchical metric learning tasks, achieving state-of-the-art results in two domains and drastically reducing runtime.

Paper Structure

This paper contains 55 sections, 15 equations, 2 figures, 8 tables, and 2 algorithms.

Figures (2)

  • Figure 1: Tangent planes of a hyperboloid with curvature $-1$ relative to another hyperboloid with curvature $-0.7$. Tangential properties between manifolds are better respected at the origin, where the tangent planes remain parallel.
  • Figure 2: The output of the proposed flexible tanh function. In the vanilla version the maximum value $m$ is set to 9.1 (alternate value $m=18$) and the slope $s$ is set to 2.6 (alternate value $s=3.5$).
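The exact formula behind the flexible tanh function in Figure 2 is not given in this summary. The sketch below is therefore only an assumed parameterization, chosen so that $m$ is the saturation value and $s$ is the slope at the origin; the paper's actual definition may differ, and the name `flexible_tanh` is hypothetical.

```python
import math

def flexible_tanh(x, m=9.1, s=2.6):
    """Assumed form m * tanh(s * x / m): bounded by +/- m, with slope s at x = 0.

    This is an illustrative guess, not the paper's formula.
    """
    return m * math.tanh(s * x / m)

# Compare the vanilla setting with the alternate setting from the caption.
for x in (0.5, 2.0, 10.0):
    print(x, flexible_tanh(x), flexible_tanh(x, m=18.0, s=3.5))
```

With this form, small inputs are scaled roughly linearly by $s$, while large inputs are squashed smoothly toward the bound $m$, which is the qualitative behavior a maximum-distance rescaling needs.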