
Topological Learning in Multi-Class Data Sets

Christopher Griffin, Trevor Karn, Benjamin Apple

TL;DR

We hypothesize that topological complexity is negatively correlated with the ability of a fully connected feedforward deep neural network to learn to classify data correctly, and we validate this relationship between topological complexity and learning in DNNs on multiple data sets.

Abstract

We specialize techniques from topological data analysis to the problem of characterizing the topological complexity (as defined in the body of the paper) of a multi-class data set. As a by-product, we define a topological classifier that uses an open sub-covering of the data set. This sub-covering can be used to construct a simplicial complex whose topological features (e.g., Betti numbers) provide information about the classification problem. We use these topological constructs to study the impact of topological complexity on learning in feedforward deep neural networks (DNNs). We hypothesize that topological complexity is negatively correlated with the ability of a fully connected feedforward deep neural network to learn to classify data correctly. We evaluate our topological classification algorithm on multiple constructed and open-source data sets. We also validate our hypothesis regarding the relationship between topological complexity and learning in DNNs on multiple data sets.
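To make the idea of an open sub-covering and its simplicial complex concrete, here is a minimal sketch in Python. It is not the paper's algorithm, which is defined in the body of the paper; it is one plausible reading under stated assumptions. The assumptions are: each covering set is an open ball centred on a data point, with radius equal to the distance to the nearest point of any other class; the sub-cover is chosen greedily, largest balls first; and only the zeroth Betti number (the number of connected components) of the resulting complex is computed. The names `class_cover` and `betti_0` are hypothetical.

```python
import math


def class_cover(points, labels, target):
    """Greedy open-ball sub-cover of the points labelled `target`.

    Assumption: each ball's radius is the distance from its centre to the
    nearest point of any other class, so no ball contains a foreign point.
    Requires at least one point outside the target class.
    """
    own = [p for p, y in zip(points, labels) if y == target]
    other = [p for p, y in zip(points, labels) if y != target]
    balls = [(p, min(math.dist(p, q) for q in other)) for p in own]
    # Consider the largest balls first; keep a ball only if its centre
    # is not already inside a ball that was kept earlier.
    balls.sort(key=lambda b: -b[1])
    cover = []
    for centre, radius in balls:
        if not any(math.dist(centre, c) < r for c, r in cover):
            cover.append((centre, radius))
    return cover


def betti_0(cover):
    """Connected components of the cover's intersection graph (union-find).

    Two balls are joined by an edge when the open balls overlap, which is
    the 1-skeleton of the nerve of the cover.
    """
    parent = list(range(len(cover)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, (ci, ri) in enumerate(cover):
        for j in range(i):
            cj, rj = cover[j]
            if math.dist(ci, cj) < ri + rj:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(cover))})


# Toy data: class 0 sits on both sides of class 1 along a line.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 0), (6, 0)]
labels = [0, 0, 0, 0, 1, 1]
cover_0 = class_cover(points, labels, 0)
cover_1 = class_cover(points, labels, 1)
print(len(cover_0), betti_0(cover_0))
print(len(cover_1), betti_0(cover_1))
```

In this toy example class 0 is split into two pieces by class 1, so its cover needs more balls and has more components than the cover of class 1. That is the kind of signal the abstract attributes to the topological features of the complex, although the paper's own complexity measure may be defined differently.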

Paper Structure

This paper contains 13 sections, 20 equations, 18 figures, 17 tables, 3 algorithms.

Figures (18)

  • Figure 1: (Top) An illustration of a data set and two manifolds with a highly nonlinear boundary. (Bottom) The simplicial complex generated for Class 1.
  • Figure 2: The size of the cover increases monotonically as the complexity of the boundary between the two classes increases.
  • Figure 3: (Top) The visualization of the joint simplicial complex of all topological covers and the histograms of the radii of the covering sets. (Bottom) Visualization of the t-SNE dimensionality reduction. Blue is class 0, red is class 1, green is class 2.
  • Figure 4: Confusion matrices for the waveform data set. (Top Left) Topological classifier. (Top Right) Random forest classifier. (Bottom Left) Deep neural network classifier. (Bottom Right) Shallow neural network classifier.
  • Figure 5: A visualization of the MNIST data using the simplicial complex models of the manifolds.
  • ...and 13 more figures