Clustering in hyperbolic balls

Vladimir Jaćimović, Aladin Crnkić

TL;DR

This work addresses the lack of principled clustering tools for hyperbolic data by developing a rigorous framework based on conformal barycenters and a Möbius distribution family. It introduces k-means in hyperbolic balls and an EM algorithm for learning Möbius mixtures, applicable to both the Poincaré disc and higher-dimensional Poincaré balls. The key contributions include explicit formulations for conformal and weighted barycenters, random variate generation, MLE procedures, and practical algorithms with comprehensive experiments on synthetic Möbius mixtures in multiple dimensions. The results establish foundational methods for unsupervised learning in hyperbolic spaces, enabling future hyperbolic deep learning pipelines and principled statistical modeling on negatively curved manifolds.

Abstract

The idea of representing data in negatively curved manifolds has recently attracted a lot of attention and given rise to a new research direction named hyperbolic machine learning (ML). To unveil the full potential of this new paradigm, efficient techniques for data analysis and statistical modeling in hyperbolic spaces are necessary. In the present paper, a rigorous mathematical framework for clustering in hyperbolic spaces is established. First, we introduce $k$-means clustering in hyperbolic balls, based on a novel definition of the barycenter. Second, we present an expectation-maximization (EM) algorithm for learning mixtures of novel probability distributions in hyperbolic balls. In this way, we lay the foundation of unsupervised learning in hyperbolic spaces.
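
Any $k$-means variant in a hyperbolic ball needs a hyperbolic distance for its assignment step. The following is a minimal sketch, assuming the Poincaré disc model with curvature $-1$ and points stored as Python complex numbers of modulus less than 1. The function names are illustrative and do not come from the paper, and the paper's barycenter-based center update is not reproduced here.

```python
import math


def mobius_to_origin(z: complex, a: complex) -> complex:
    """Disc automorphism that sends the point a to 0 (assumes |a| < 1)."""
    return (z - a) / (1 - a.conjugate() * z)


def poincare_distance(z: complex, w: complex) -> float:
    """Hyperbolic distance in the unit disc, curvature -1 (assumed model)."""
    # Move w to the origin; the distance from 0 to a point u is 2*artanh(|u|).
    return 2.0 * math.atanh(abs(mobius_to_origin(z, w)))


def assign_clusters(points, centers):
    """Assignment step only: index of the hyperbolically nearest center."""
    return [
        min(range(len(centers)), key=lambda j: poincare_distance(p, centers[j]))
        for p in points
    ]


if __name__ == "__main__":
    pts = [0.1 + 0j, 0.8 + 0j]
    ctrs = [0j, 0.9 + 0j]
    print(assign_clusters(pts, ctrs))
```

The distance is invariant under disc automorphisms, which is why moving one point to the origin before measuring is legitimate.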

Paper Structure

This paper contains 23 sections, 23 equations, 13 figures.

Figures (13)

  • Figure 1: Barycenters of configurations of points in the Poincaré disc
  • Figure 2: Weighted barycenters of configurations of points in the Poincaré disc (two panels)
  • Figure 3: Random samples in the Poincaré disc: a) $s=3$, $a=0$; b) $s=5$, $a=0$; c) $s=7$, $a=0$; d) $s=3$, $a=0.9e^{i\frac{\pi}{4}}$; e) $s=5$, $a=0.9e^{i\frac{\pi}{4}}$; f) $s=7$, $a=0.9e^{i\frac{\pi}{4}}$.
  • Figure 4: Experiment A1 (three panels): a) ground truth; b) k-means; c) EM algorithm
  • Figure 5: Log-likelihood for EM in Experiment A1
  • ...and 8 more figures

Theorems & Definitions (7)

  • Definition 1
  • Definition 2
  • Remark 1
  • Definition 3
  • Definition 4
  • Remark 2
  • Remark 3