Topological Learning in Multi-Class Data Sets

Christopher Griffin; Trevor Karn; Benjamin Apple

TL;DR

This paper hypothesizes that topological complexity is negatively correlated with the ability of a fully connected feedforward deep neural network (DNN) to learn to classify data correctly, and validates the relationship between topological complexity and learning in DNNs on multiple data sets.

Abstract

We specialize techniques from topological data analysis to the problem of characterizing the topological complexity (as defined in the body of the paper) of a multi-class data set. As a by-product, a topological classifier is defined that uses an open sub-covering of the data set. This sub-covering can be used to construct a simplicial complex whose topological features (e.g., Betti numbers) provide information about the classification problem. We use these topological constructs to study the impact of topological complexity on learning in feedforward deep neural networks (DNNs). We hypothesize that topological complexity is negatively correlated with the ability of a fully connected feedforward deep neural network to learn to classify data correctly. We evaluate our topological classification algorithm on multiple constructed and open-source data sets. We also validate our hypothesis regarding the relationship between topological complexity and learning in DNNs on multiple data sets.
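
The abstract does not spell out the covering construction, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the paper's algorithm. It assumes that each covering set is an open Euclidean ball centered at a data point, with radius equal to the distance to the nearest point of a different class, that a sub-cover is chosen greedily (largest balls first), and that a query point is labeled by the ball it lies deepest inside (or nearest to, if it lies in none). The names `build_cover`, `predict`, and `betti0` are hypothetical. The zeroth Betti number is computed from the 1-skeleton of the nerve of one class's balls, where two balls are joined when they overlap.

```python
import numpy as np

def build_cover(X, y):
    """Greedy sub-cover of each class by open balls (illustrative assumption)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances between all data points.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Radius of the ball at point i: distance to the nearest point of another class.
    other = y[:, None] != y[None, :]
    radii = np.where(other, D, np.inf).min(axis=1)
    cover = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        uncovered = set(idx.tolist())
        # Visit same-class points from largest ball to smallest.
        for i in sorted(idx.tolist(), key=lambda k: -radii[k]):
            if i in uncovered:
                cover.append((X[i], radii[i], label))
                inside = idx[D[i, idx] < radii[i]]
                uncovered -= set(inside.tolist())
                uncovered.discard(i)  # the center always counts as covered
    return cover

def predict(cover, x):
    """Label x by the ball with the smallest (distance to center - radius)."""
    x = np.asarray(x, dtype=float)
    best = min(cover, key=lambda b: np.linalg.norm(x - b[0]) - b[1])
    return best[2]

def betti0(cover, label):
    """Connected components of the nerve 1-skeleton for one class's balls."""
    balls = [(c, r) for c, r, lab in cover if lab == label]
    parent = list(range(len(balls)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(balls)):
        for j in range(i + 1, len(balls)):
            # Two open balls intersect iff the centers are closer than r_i + r_j.
            if np.linalg.norm(balls[i][0] - balls[j][0]) < balls[i][1] + balls[j][1]:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(balls))})

# Toy data (made up for illustration): two well-separated clusters.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_demo = np.array([0, 0, 0, 1, 1, 1])
cover_demo = build_cover(X_demo, y_demo)
```

In this sketch, the number of balls in the sub-cover grows as the class boundary becomes more intricate, which is one simple way to see why the size of a cover might serve as a proxy for complexity. The paper's own definition of topological complexity is given in its body and may differ.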

Paper Structure (13 sections, 20 equations, 18 figures, 17 tables, 3 algorithms)

Introduction
Topological Features of Multi-Class Data
Classification with the Topological Cover
Results on Topological Classification
Complex Boundaries in Two Dimensions
Waveform Generator
MNIST
HEPMASS
Undersea Acoustic Data Set
Topological Implications for Learning Neural Networks: Tile Model
Topological Implications for Learning a Children's Game
Conclusion and Future Directions
Alternate Covering Algorithm

Figures (18)

Figure 1: (Top) An illustration of a data set and two manifolds with a highly nonlinear boundary. (Bottom) The simplicial complex generated for Class 1.
Figure 2: The size of the cover increases monotonically as the complexity of the boundary between the two classes increases.
Figure 3: (Top) The visualization of the joint simplicial complex of all topological covers and the histograms of the radii of the covering sets. (Bottom) Visualization of the TSNE dimensional reduction. Blue is class 0, red is class 1, green is class 2.
Figure 4: Confusion matrices for the waveform data set. (Top Left) Topological classifier. (Top Right) Random forest classifier. (Bottom Left) Deep neural network classifier. (Bottom Right) Shallow neural network classifier.
Figure 5: A visualization of the MNIST data using the simplicial complex models of the manifolds.
...and 13 more figures
