Harnessing Data Asymmetry: Manifold Learning in the Finsler World

Thomas Dagès, Simon Weber, Daniel Cremers, Ron Kimmel

Abstract

Manifold learning is a fundamental task at the core of data analysis and visualisation. It aims to capture the simple underlying structure of complex high-dimensional data by preserving pairwise dissimilarities in low-dimensional embeddings. Traditional methods rely on symmetric Riemannian geometry, which forces both the dissimilarities and the embedding space (e.g. Euclidean) to be symmetric. In practice, however, this discards valuable asymmetric information inherent to the non-uniformity of data samples. We propose to harness this asymmetry by switching to Finsler geometry, an asymmetric generalisation of Riemannian geometry, and introduce a Finsler manifold learning pipeline that constructs asymmetric dissimilarities and embeds them in a Finsler space. This greatly broadens the applicability of existing asymmetric embedders, from traditionally directed data to any data. We also modernise asymmetric embedders by generalising current reference methods to asymmetry, yielding Finsler t-SNE and Finsler Umap. On controlled synthetic and large real datasets, we show that our asymmetric pipeline reveals valuable information lost in the traditional pipeline, e.g. density hierarchies, and consistently provides higher-quality embeddings than its Euclidean counterparts.

Paper Structure (55 sections, 6 theorems, 41 equations, 17 figures, 7 tables)

Key Result

Theorem 1

Let $\mathcal{L} = -\sum_{ij} p_{ij}\ln q_{ij}^F$ be the Finsler t-SNE loss, with $q_{ij}^F$ given by the Finsler embedding dissimilarities equation. Denoting $t_{ij}^F = \left(1+\tfrac{(d_{ij}^F)^2}{\nu}\right)^{-1}$ and $\delta_{pq^F}^{ij} = p_{ij} - q_{ij}^F$, then
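
The quantities named in this statement can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it takes an arbitrary asymmetric matrix of embedding distances $d_{ij}^F$ as given, and it assumes that $q_{ij}^F$ is obtained by normalising the kernel $t_{ij}^F$ over all pairs $i \neq j$, as in standard t-SNE. That normalisation, the function names, and the toy matrices `d` and `p` are assumptions made for illustration.

```python
import numpy as np

def student_t_affinities(d, nu=1.0):
    """Return the kernel t_ij = (1 + d_ij^2 / nu)^(-1) and a normalised q_ij.

    ASSUMPTION: q is normalised over all off-diagonal pairs, as in
    standard t-SNE; the source does not show its definition of q_ij^F.
    """
    t = 1.0 / (1.0 + d**2 / nu)
    np.fill_diagonal(t, 0.0)  # self-pairs do not contribute
    q = t / t.sum()
    return t, q

def finsler_tsne_loss(p, q, eps=1e-12):
    """Cross-entropy loss L = -sum_{i != j} p_ij ln q_ij."""
    off_diag = ~np.eye(len(p), dtype=bool)
    return -np.sum(p[off_diag] * np.log(q[off_diag] + eps))

# Hypothetical asymmetric embedding distances: d[i, j] != d[j, i] in general.
d = np.array([[0.0, 1.0, 2.0],
              [1.5, 0.0, 1.0],
              [2.0, 3.0, 0.0]])

# Hypothetical data affinities: uniform over the six ordered pairs.
p = (1.0 - np.eye(3)) / 6.0

t, q = student_t_affinities(d, nu=1.0)
delta = p - q                      # the differences p_ij - q_ij
loss = finsler_tsne_loss(p, q)
```

Because `d` is asymmetric, so are `t` and `q`: here `q[0, 1]` differs from `q[1, 0]`, which a symmetric Euclidean embedding cannot represent.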

Figures (17)

  • Figure 1: Motivation: how asymmetry can arise and why preserving it matters. We aim to recover the underlying smooth US manifold from US cities (latitude-longitude). Hidden external factors (e.g. mountain ranges) bias the sampling density: fewer cities lie in high-altitude regions. Reweighing distances by local density yields asymmetric dissimilarities that encode differences in geographical setting. Symmetrising and embedding in symmetric spaces (e.g. Isomap, t-SNE, Umap, Poincaré maps) discards this information. Our asymmetric dissimilarity construction can be fed to existing traditional asymmetric embedders (e.g. slide-vector, radius-distance), but these are heuristic and non-metric. Finsler geometry provides a principled metric framework, enabling the existing Finsler MDS dages2025finsler and our novel and more scalable Finsler t-SNE and Umap to better embed and reveal the hidden terrain.
  • Figure 2: The traditional manifold learning pipeline leads to asymmetric data dissimilarities on sampled data due to directed proximity graphs and local distance transforms. As this violates the Riemannian manifold assumption and is incompatible with symmetric Euclidean space embeddings, a heuristic symmetrisation of the data dissimilarities is required, yet it is theoretically unjustified. We propose to equip the data manifold with a Finsler metric, which allows asymmetric dissimilarities. Embeddings are then performed in a canonical Finsler space, enabling us not only to accurately capture the structure of the data but also to harness and reveal the natural asymmetry of the sampling.
  • Figure 3: Left: binary asymmetry from absent reverse edges in directed proximity graphs. Right: non-binary asymmetry, where reciprocal edges have differing distances due to locally tweaking the metric and approximating geodesic distances with tangent-space distances.
  • Figure 4: Metrics define distances in tangent spaces (left) via their convex unit tangent ball. Riemannian metrics -- whether isotropic or anisotropic -- remain symmetric due to symmetric unit tangent balls. Finsler metrics allow asymmetry, so geodesic paths and distances need not be symmetric (right). Courtesy of weber2024finsler, dages2025finsler, dages2025metric.
  • Figure 5: Toy planar data with non-uniform density, embedded with symmetric baselines and our Finsler methods using asymmetric dissimilarities. Variation in the $z$ coordinate reveals asymmetric distances that quantitatively encode density differences, while a top view ($xy$ only) preserves the manifold as in Isomap.
  • ...and 12 more figures
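
The two sources of asymmetry described in Figure 3 can be reproduced on a toy example. The sketch below builds a directed k-nearest-neighbour graph and rescales each outgoing edge by a local scale. It is an illustration under stated assumptions rather than the paper's construction: the function name `directed_knn_dissimilarities`, the choice of the k-th neighbour distance as the local scale, and the sample points are all hypothetical.

```python
import numpy as np

def directed_knn_dissimilarities(x, k):
    """Directed kNN dissimilarities w[i, j] for points x of shape (n, dim).

    w[i, j] is finite only if j is among the k nearest neighbours of i.
    ASSUMPTION: each outgoing distance is divided by sigma_i, the distance
    from i to its k-th neighbour, as a stand-in for a local distance transform.
    """
    n = len(x)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # a point is not its own neighbour
    idx = np.argsort(d, axis=1)[:, :k]          # k nearest neighbours of each point
    sigma = d[np.arange(n), idx[:, -1]]         # local scale: k-th neighbour distance
    w = np.full((n, n), np.inf)                 # inf marks an absent edge
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    w[rows, cols] = d[rows, cols] / sigma[rows]
    return w

# Non-uniform sampling on a line: one isolated point and a dense cluster.
x = np.array([[0.0], [1.0], [1.1], [1.25]])
w = directed_knn_dissimilarities(x, k=2)

# Binary asymmetry: the isolated point 0 reaches point 1, but not vice versa.
has_forward_edge = np.isfinite(w[0, 1])
has_reverse_edge = np.isfinite(w[1, 0])

# Non-binary asymmetry: points 1 and 2 are reciprocal neighbours, yet the
# rescaled dissimilarities differ because their local scales differ.
reciprocal_pair = (w[1, 2], w[2, 1])
```

Symmetrising `w`, for instance by averaging it with its transpose, would erase exactly the density information that these two effects encode.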

Theorems & Definitions (11)

  • Theorem 1: Finsler t-SNE
  • Theorem 2: Finsler Umap
  • proof
  • Theorem 3: Fixed t-SNE gradient
  • proof
  • Theorem 4
  • proof
  • Corollary 1
  • Theorem 5: Gradient canonical Finsler distances
  • proof (Finsler t-SNE update rule): Finsler t-SNE
  • ...and 1 more