The Theory behind UMAP?

David Wegmann

TL;DR

This article contributes an explicit description of the metric realization and related functors, and discusses claims about properties of the UMAP algorithm and about the correspondence of McInnes et al.'s finite variant to that algorithm.

Abstract

In 2018, McInnes et al. introduced a dimensionality reduction algorithm called UMAP, which enjoys wide popularity among data scientists. Their work introduces a finite variant of a functor called the metric realization, based on an unpublished draft by Spivak. This draft contains many errors, most of which are reproduced by McInnes et al. and subsequent publications. This article aims to repair these errors and provide a self-contained document with the full derivation of Spivak's functors and McInnes et al.'s finite variant. We contribute an explicit description of the metric realization and related functors. At the end, we discuss the UMAP algorithm, as well as claims about properties of the algorithm and the correspondence of McInnes et al.'s finite variant to the UMAP algorithm.

Paper Structure (69 sections, 45 theorems, 137 equations)


Key Result

Lemma 2.1.1

Let $\mathcal{A}, \mathcal{B}, \mathcal{C}$ be categories and let $\mathcal{S}$ be a full subcategory of $\mathcal{C}^\mathcal{B}$. Then there is an isomorphism of categories $\mathcal{S}^{\mathcal{A}} \cong \mathcal{T}$, where $\mathcal{T}$ is the full subcategory of $\mathcal{C}^{\mathcal{A} \times \mathcal{B}}$ that has as objects all functors $F : \mathcal{A} \times \mathcal{B} \rightarrow \mathcal{C}$ such that $F(A,-) : \mathcal{B} \rightarrow \mathcal{C}$ ...

Theorems & Definitions (115)

  • Lemma 2.1.1
  • Definition 2.2.1
  • Lemma 2.2.2
  • Definition 2.3.1
  • Theorem 2.3.2
  • Corollary 2.3.3
  • Definition 2.3.4
  • Lemma 2.3.5
  • Remark 2.3.6
  • Remark 2.3.7
  • ...and 105 more