Understanding Neural Network Systems for Image Analysis using Vector Spaces and Inverse Maps

Rebecca Pattichis, Marios S. Pattichis

TL;DR

Linear algebra techniques are introduced to model neural network layers as maps between signal spaces. Vector spaces are then used to study invertible networks by computing input images that yield specific outputs.

Abstract

There is strong interest in developing mathematical methods that can be used to understand complex neural networks used in image analysis. In this paper, we introduce techniques from Linear Algebra to model neural network layers as maps between signal spaces. First, we demonstrate how signal spaces can be used to visualize weight spaces and convolutional layer kernels. We also demonstrate how residual vector spaces can be used to further visualize information lost at each layer. Second, we study invertible networks using vector spaces for computing input images that yield specific outputs. We demonstrate our approach on two invertible networks and ResNet18.
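
To make the idea of a layer as a map between vector spaces concrete, the following is a minimal sketch for a single linear layer, not the paper's own implementation. It assumes the signal space is spanned by the right singular vectors of the weight matrix (scaled by their singular values, as in the Figure 1 caption) and that the information lost by the layer is the part of the input lying in the null space. The names `signal_space` and `split_input` and the random weights are hypothetical, and the paper's own residual definition is not reproduced in this excerpt.

```python
import numpy as np

def signal_space(W):
    """Scaled right singular vectors and condition number of a weight matrix.

    Assumption: the "signal space" is the row space of W, described by the
    vectors sigma_i * v_i from the SVD W = U diag(sigma) V^T.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    scaled = s[:, None] * Vt      # row i holds sigma_i * v_i
    cond = s[0] / s[-1]           # ratio of largest to smallest singular value
    return scaled, Vt, cond

def split_input(x, Vt):
    """Split x into the part the layer sees and the part it discards."""
    x_signal = Vt.T @ (Vt @ x)    # orthogonal projection onto the row space
    x_residual = x - x_signal     # null-space component, mapped to zero by W
    return x_signal, x_residual

# Hypothetical stand-in for a 784 -> 10 fully connected layer (random weights).
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 784))
x = rng.standard_normal(784)

scaled, Vt, cond = signal_space(W)
x_signal, x_residual = split_input(x, Vt)
```

Under these assumptions, `W @ x_residual` is numerically zero, so the layer output depends only on `x_signal`. Each row of `scaled` can be reshaped to 28×28 and viewed as an image.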

Paper Structure (10 sections, 14 equations, 3 figures)

Figures (3)

  • Figure 1: Vector spaces for a single-layer fully connected neural network applied to the MNIST digits dataset. The top row represents the original weights. The middle row represents the signal space: $\sigma_0 v_0, \dots, \sigma_9 v_9$. The condition number is 7.22. The last row represents the residual vectors when the network is applied to the average of each digit class (see equation (\ref{eq:residual})).
  • Figure 2: The signal space for the first 2D convolution layer in the first Sequential layer of ResNet fine-tuned for MNIST classification (99% accuracy, 1.07 condition number).
  • Figure 3: Generated ideal input images for each digit using different algorithms (see section \ref{sec:inverse}). Top row: 1-layer FCNN (92% accuracy): avg-img+training for 0, 1, 4, 5, 6, 8, and 9; min-img+training for 2, 3, and 7. Middle row: 5-layer FCNN (97% accuracy): avg-img+training for 1, 2, 6, and 7; min-img+training for 0, 3, 4, 5, 8, and 9. Bottom row: ResNet128 (99% accuracy): avg-img for 1, 4, 5, 6, 7, and 9; min-img for 0, 2, 3, and 8 that look binarized; avg-min-img for 4.
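
As a rough illustration of computing an input that yields a specific output, the sketch below handles only a single linear layer $y = Wx + b$ and uses the Moore-Penrose pseudoinverse to obtain the minimum-norm solution. This is a standard linear algebra technique, and this excerpt does not say whether it matches the avg-img or min-img algorithms named in the Figure 3 caption. The function name `min_norm_input`, the random weights, and the one-hot target are all hypothetical.

```python
import numpy as np

def min_norm_input(W, b, y_target):
    """Smallest-norm x satisfying W @ x + b = y_target.

    Assumption: W has full row rank, so an exact solution exists.
    """
    return np.linalg.pinv(W) @ (y_target - b)

# Hypothetical 784 -> 10 layer with random weights and bias.
rng = np.random.default_rng(1)
W = rng.standard_normal((10, 784))
b = rng.standard_normal(10)

# Target output: a one-hot vector selecting class 3.
y_target = np.zeros(10)
y_target[3] = 1.0

x_star = min_norm_input(W, b, y_target)

# Any null-space vector can be added without changing the output.
z = rng.standard_normal(784)
z_null = z - np.linalg.pinv(W) @ (W @ z)   # remove the row-space part of z
x_other = x_star + z_null
```

Both `x_star` and `x_other` produce the same output, which illustrates why such an inverse is not unique for a layer that reduces dimension: the solution set is `x_star` plus the null space of `W`, and `x_star` is its smallest-norm member.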