Manifold Learning via Foliations and Knowledge Transfer

E. Tron, E. Fioresi

TL;DR

This paper studies how to represent the geometry of high-dimensional data by using a trained deep ReLU classifier (a CNN in the experiments) to induce a foliation on data space via the data information matrix (DIM) $D(x,w)$. The data-space distribution $\mathcal{D}_x = \mathrm{span}_{i}\{\nabla_x \log p_i(y|x,w)\}$ defines a learning foliation: its singular points are contained in a measure-zero set, so a local regular foliation exists almost everywhere, and Frobenius-type integrability yields the leaves. Empirically, leaves align with real data, and moving along a leaf preserves meaningful predictions, while orthogonal directions degrade accuracy. The spectrum of $D(x,w)$ serves as a distance proxy between datasets, so comparing DIM eigenvalues across datasets can inform knowledge transfer. The framework extends beyond the traditional manifold assumption, offering a geometric, information-theoretic approach to dimensionality reduction and transfer in deep classifiers.

Abstract

Understanding how real data is distributed in high-dimensional spaces is the key to many tasks in machine learning. We want to provide a natural geometric structure on the space of data by employing a deep ReLU neural network trained as a classifier. Through the data information matrix (DIM), a variation of the Fisher information matrix, the model discerns a singular foliation structure on the space of data. We show that the singular points of this foliation are contained in a measure-zero set, and that a local regular foliation exists almost everywhere. Experiments show that the data is correlated with the leaves of this foliation. Moreover, we show the potential of our approach for knowledge transfer by analyzing the spectrum of the DIM to measure distances between datasets.

Paper Structure (8 sections, 6 theorems, 16 equations, 8 figures, 2 tables)

Key Result

Proposition 3.1

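The Fisher information matrix $F(x, w)$ and the data information matrix $D(x, w)$ are positive semidefinite symmetric matrices. Moreover: $$\ker F(x,w) = \left(\mathrm{span}_{i=1,\dots,c}\{\nabla_w \log p_i(y|x,w)\}\right)^\perp, \qquad \ker D(x,w) = \left(\mathrm{span}_{i=1,\dots,c}\{\nabla_x \log p_i(y|x,w)\}\right)^\perp,$$ where the orthogonal complement is taken with respect to the Euclidean inner product. In particular, $\mathrm{rank}\ F(x,w) < c$ and $\mathrm{rank}\ D(x,w) < c$.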
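
The properties stated in Proposition 3.1 can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: it assumes the DIM is defined like the Fisher matrix but with gradients taken in the input, $D(x,w) = \sum_{i=1}^{c} p_i(y|x,w)\, \nabla_x \log p_i(y|x,w)\, \nabla_x \log p_i(y|x,w)^T$, and it uses a small, randomly initialized one-hidden-layer ReLU softmax classifier rather than the trained CNN of the paper. The function name `data_information_matrix` and all sizes are hypothetical. The rank bound follows from $\sum_i p_i \nabla_x \log p_i = \nabla_x \sum_i p_i = 0$, so the $c$ gradients span at most $c-1$ dimensions.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D array of logits.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def data_information_matrix(x, W1, b1, W2, b2):
    """DIM of a one-hidden-layer ReLU softmax classifier at input x.

    Assumed definition (not taken verbatim from the source):
    D = sum_i p_i * g_i g_i^T with g_i = grad_x log p_i(y|x,w).
    """
    pre = W1 @ x + b1                      # hidden pre-activations, shape (m,)
    h = np.maximum(pre, 0.0)               # ReLU
    p = softmax(W2 @ h + b2)               # class probabilities, shape (c,)
    # Jacobian of the logits w.r.t. x: W2 @ diag(pre > 0) @ W1, shape (c, n).
    J = W2 @ (W1 * (pre > 0)[:, None])
    # Row i of G is grad_x log p_i = (e_i - p)^T J.
    G = (np.eye(len(p)) - p[None, :]) @ J  # shape (c, n)
    D = G.T @ (p[:, None] * G)             # shape (n, n)
    return D, G, p

# Hypothetical sizes: n input dimensions, m hidden units, c classes.
rng = np.random.default_rng(0)
n, m, c = 6, 8, 3
W1, b1 = rng.normal(size=(m, n)), rng.normal(size=m)
W2, b2 = rng.normal(size=(c, m)), rng.normal(size=c)
x = rng.normal(size=n)

D, G, p = data_information_matrix(x, W1, b1, W2, b2)
eigvals = np.linalg.eigvalsh(D)            # real eigenvalues of a symmetric matrix
rank = np.linalg.matrix_rank(D, tol=1e-10)

print("symmetric:", np.allclose(D, D.T))
print("smallest eigenvalue:", eigvals.min())
print("rank of D:", rank, "number of classes:", c)
```

The eigenvalues returned by `eigvalsh` are the spectrum that the paper proposes to compare across datasets; how that comparison is turned into a distance is not specified in this summary and is not attempted here.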

Figures (8)

  • Figure 1: Foliations: $\mathcal{L}_\mathcal{D}$ and $\mathcal{L}_{\mathcal{D}^\perp}$ denote the set of the leaves in distributions $\mathcal{D}$ and $\mathcal{D}^\perp$ respectively.
  • Figure 2: Moving according to $\mathcal{D}$.
  • Figure 3: Moving according to $\mathcal{D}^\perp$.
  • Figure 4: The learning foliation defined by the distribution $\mathcal{D}$ for an XOR network.
  • Figure 5: The structure of our CNN - picture created with lenail.
  • ...and 3 more figures

Theorems & Definitions (10)

  • Proposition 3.1
  • Theorem 3.2
  • Proposition 3.3
  • Remark 3.1
  • Lemma 3.4
  • proof
  • Lemma 3.5
  • proof
  • Theorem 3.6
  • proof