Motif distribution and function of sparse deep neural networks

Olivia T. Zahn, Thomas L. Daniel, J. Nathan Kutz

TL;DR

Despite random initialization of network parameters, enforced sparsity causes DNNs to converge to similar connectivity patterns, as characterized by their motif distributions. This result suggests how neural network function can be encoded in motif distributions and points to a variety of experiments for informing function and control.

Abstract

We characterize the connectivity structure of feed-forward deep neural networks (DNNs) using network motif theory. To address whether a particular motif distribution is characteristic of the training task, or function, of the DNN, we compare the connectivity structure of 350 DNNs trained to simulate a bio-mechanical flight control system with different randomly initialized parameters. We develop and implement algorithms for counting second- and third-order motifs and calculate their significance using their Z-score. The DNNs are trained to solve the inverse problem of the flight dynamics model in Bustamante et al. (2022) (i.e., to predict the controls necessary for controlled flight from the initial and final state-space inputs) and are sparsified through an iterative pruning and retraining algorithm (Zahn et al., 2022). We show that, despite random initialization of network parameters, enforced sparsity causes DNNs to converge to similar connectivity patterns, as characterized by their motif distributions. The results suggest how neural network function can be encoded in motif distributions and point to a variety of experiments for informing function and control.
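
The Z-score mentioned in the abstract can be illustrated with a minimal sketch. It assumes the conventional network-motif definition, in which an observed motif count is compared with the mean and standard deviation of counts from a reference (null) ensemble. The function name `motif_z_score`, the use of the sample standard deviation, and the toy counts are illustrative assumptions; the excerpt does not specify the paper's null model or estimator.

```python
import statistics


def motif_z_score(observed_count, null_counts):
    """Return (observed - mean(null)) / std(null) for one motif.

    null_counts holds counts of the same motif in a reference ensemble.
    How that ensemble is generated is an assumption left to the caller.
    """
    mean = statistics.mean(null_counts)
    std = statistics.stdev(null_counts)  # sample standard deviation (assumed)
    if std == 0:
        raise ValueError("null ensemble has zero variance; Z-score undefined")
    return (observed_count - mean) / std


# Toy numbers, not taken from the paper.
observed = 12
null_counts = [8, 10, 9, 11, 7]
z = motif_z_score(observed, null_counts)
print(f"Z-score: {z:.3f}")
```

A positive value indicates that the motif occurs more often than in the reference ensemble, and a negative value indicates that it occurs less often.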

Paper Structure

This paper contains 19 sections, 6 equations, 9 figures, 9 algorithms.

Figures (9)

  • Figure 1: Top: A densely connected DNN is trained to predict the control variables for the task of insect hovering. Initial and final state-space variables are used as inputs to the network. The trained network is pruned to maximal sparsity with little decrease in performance. Middle: A subset of 2nd- and 3rd-order network sub-graphs that can exist in a feed-forward DNN. Bottom: Training and validation loss over 350 networks pruned to different sparsity levels.
  • Figure 2: Distributions of Z-scores across 350 DNNs pruned to 98% sparsity. The top axis shows the motif, the left axis shows the number of standard deviations from the mean (the Z-score), and the right axis shows the cumulative percentage.
  • Figure 3: Z-score distributions across sparsity levels. Each panel shows how the Z-score of the pictured motif changes throughout the pruning process. The bottom right panel shows the test MSE across all 350 networks at increasing levels of sparsity.
  • Figure 4: Left: Example of a two-layer sparse network with inputs, $\vec{x}$, and outputs, $\vec{y}$. Center: Forward pass computation, where $\mathbf{W}$ represents the weight matrix, $\mathbf{M}$ represents the mask matrix, and $\sigma$ represents the nonlinear activation function (bias is excluded for simplicity). Right: Example of weight matrix $\mathbf{W}$ and mask matrix $\mathbf{M}$.
  • Figure 5: Left: Example of sparse feed-forward network with inputs, $\vec{x}$, outputs, $\vec{y}$, and one hidden layer, $\vec{h}_1$. Middle: Same network with second-order chain sub-graphs highlighted. Right: Masks representing the connectivity of the network between the layers (e.g., $\mathbf{M_{x,h_1}}$ for the weights between layers $\vec{x}$ and $\vec{h}_1$).
  • ...and 4 more figures
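
The masked forward pass in the Figure 4 caption and the second-order chain sub-graphs in the Figure 5 caption can be sketched in a few lines of plain Python. This is a rough illustration, not the paper's algorithm. It assumes that each mask has shape (units in the next layer) × (units in the previous layer), that $\sigma$ is `tanh`, and that a second-order chain is a path $x_j \to h_i \to y_k$. The function names and toy matrices are hypothetical.

```python
import math


def masked_forward(W, M, x, activation=math.tanh):
    """One layer of y = activation((W * M) x), elementwise mask, no bias."""
    return [
        activation(sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)))
        for w_row, m_row in zip(W, M)
    ]


def count_second_order_chains(M_xh, M_hy):
    """Count paths x_j -> h_i -> y_k through one hidden layer.

    Each hidden unit contributes in_degree * out_degree chains.
    """
    total = 0
    for i, row in enumerate(M_xh):
        in_degree = sum(row)                             # inputs feeding h_i
        out_degree = sum(out_row[i] for out_row in M_hy)  # outputs fed by h_i
        total += in_degree * out_degree
    return total


# Toy network: 3 inputs, 2 hidden units, 2 outputs (values are illustrative).
M_xh = [[1, 0, 1],
        [0, 1, 0]]
M_hy = [[1, 1],
        [0, 1]]
W_xh = [[0.5, -1.0, 2.0],
        [1.0, 1.0, 1.0]]
x = [1.0, 2.0, 3.0]

h = masked_forward(W_xh, M_xh, x)
chains = count_second_order_chains(M_xh, M_hy)
print(chains)  # 4
```

Counting through per-unit degrees avoids enumerating every path explicitly. The same total equals the sum of all entries of the product of the two mask matrices.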