Identification and Uses of Deep Learning Backbones via Pattern Mining

Michael Livanos, Ian Davidson

TL;DR

This work explores the notion of identifying a deep learning backbone for a given group of instances. It formulates the problem as a set-cover-style problem, shows that it is intractable, and presents a highly constrained integer linear programming (ILP) formulation.

Abstract

Deep learning is extensively used in many areas of data mining as a black-box method with impressive results. However, understanding the core mechanism of how deep learning makes predictions is a relatively understudied problem. Here we explore the notion of identifying a backbone of deep learning for a given group of instances. A group here can be instances of the same class or even misclassified instances of the same class. We view each instance for a given group as activating a subset of neurons and attempt to find a subgraph of neurons associated with a given concept/group. We formulate this problem as a set-cover-style problem, show that it is intractable, and present a highly constrained integer linear programming (ILP) formulation. As an alternative, we explore a coverage-based heuristic approach related to pattern mining, and show it converges to a Pareto equilibrium point of the ILP formulation. Experimentally, we explore these backbones to identify mistakes and improve performance, explanation, and visualization. We demonstrate application-based results using several challenging data sets, including the Bird Audio Detection (BAD) Challenge and Labeled Faces in the Wild (LFW), as well as the classic MNIST data.

Paper Structure

This paper contains 10 sections, 3 equations, 6 figures, 1 table, and 1 algorithm.

Figures (6)

  • Figure 1: Potential issues with taking the union and intersection of activation vectors. In this example, there are eight neurons in the network and five instances in the concept. Neurons $n_2, n_4, n_5,$ and $n_8$ form the clearest summaries: each occurs in 80% of the instances, while each of the other neurons occurs in only 20%. The intersection is empty since it requires a neuron to be present in all instances, and the union is the whole network since it requires a neuron to be present in only one instance.
  • Figure 2: A visualization of the matrix of node activations $N$ as a series of transactions, with columns as neurons and rows as instances. Color corresponds to patterns, and groups of neurons are labeled. FMM finds only group A and ignores everything else. F-Score thresholding allows groups B and C to be included in the backbone despite having lower support than max minsup. Groups D and E have much lower support, so they will not be included.
  • Figure 3: Flow diagram of the process of flagging mispredictions and correcting them using the collective backbone and the prediction of the network.
  • Figure 4: Quantifying coverage and overlap difference between the relaxed ILP and heuristic. For both datasets, the top line represents the maximum (across folds) for that metric, the middle the median, and the lower the minimum. Coverage increases over iterations while overlap minimally increases.
  • Figure 5: Accuracy of the network, the backbone as a predictive model, and the explanation augmented predictor (EAP) on BAD test data. When used as a predictive device, the backbone underperforms the network, as expected. However, when one considers both the backbone and the output of the network, as the EAP does, accuracy increases significantly.
  • ...and 1 more figures
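
The situation described in the Figure 1 caption can be reproduced with a small sketch. The activation sets below are hypothetical and chosen only to match the caption (eight neurons, five instances, $n_2, n_4, n_5, n_8$ each active in 80% of instances, the rest in 20%). The `support` and `frequent_neurons` helpers and the 0.5 threshold are illustrative assumptions; this is plain support thresholding, not the paper's ILP or heuristic.

```python
from collections import Counter

# Hypothetical activation sets: each instance lists the indices of the
# neurons it activates. Built only to match the Figure 1 caption.
instances = [
    {1, 2, 4, 5},
    {2, 3, 4, 8},
    {2, 5, 6, 8},
    {4, 5, 7, 8},
    {2, 4, 5, 8},
]


def support(instances):
    """Fraction of instances in which each neuron is active."""
    counts = Counter(n for inst in instances for n in inst)
    return {n: c / len(instances) for n, c in counts.items()}


def frequent_neurons(instances, min_support):
    """Neurons whose support is at least min_support (an assumed threshold)."""
    return {n for n, s in support(instances).items() if s >= min_support}


# Union: a neuron only has to appear in one instance, so everything is kept.
union = set().union(*instances)

# Intersection: a neuron has to appear in every instance, so nothing is kept.
intersection = set.intersection(*instances)

# Thresholding on support separates the 80% neurons from the 20% neurons.
summary = frequent_neurons(instances, min_support=0.5)

print("union:", sorted(union))
print("intersection:", sorted(intersection))
print("support >= 0.5:", sorted(summary))
```

Any threshold strictly between 0.2 and 0.8 gives the same summary on this toy data. Real activation data will rarely separate this cleanly.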