Graph Expansion in Pruned Recurrent Neural Network Layers Preserve Performance

Suryam Arnav Kalra, Arindam Biswas, Pabitra Mitra, Biswajit Basu

TL;DR

This work investigates how expander graph theory can guide pruning of recurrent neural networks by preserving layer-wise expansion and spectral gaps. By modeling time-unfolded RNN/LSTM structures as bipartite graphs and enforcing Ramanujan-like bounds during iterative pruning, the authors demonstrate that maintaining strong combinatorial and spectral expansion preserves classification accuracy on sequential MNIST, CIFAR-10, and Google Speech Commands, even at high sparsity. The theoretical framework links Cheeger constants, eigenvalue gaps, and pruning limits, providing practical stopping criteria and a robust approach to lightweight, real-time sequence models. The findings highlight the importance of connectivity patterns, particularly in input-to-hidden connections, for sustaining performance under sparsity and noise.

Abstract

The expansion property of a graph refers to its being strongly connected while remaining sparse. It has been reported that deep neural networks can be pruned to a high degree of sparsity while maintaining their performance. Such pruning is essential for real-time sequence learning with recurrent neural networks on resource-constrained platforms. We prune recurrent networks such as RNNs and LSTMs while maintaining a large spectral gap of the underlying graphs and ensuring their layer-wise expansion properties. We also study the time-unfolded recurrent network graphs in terms of the properties of their bipartite layers. Experimental results on the benchmark sequential MNIST, CIFAR-10, and Google Speech Commands data show that expander graph properties are key to preserving the classification accuracy of RNNs and LSTMs.
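
As a rough illustration of what keeping a large spectral gap can mean for a single bipartite layer, the sketch below treats a binary pruning mask as a biadjacency matrix and compares its second-largest singular value with the Ramanujan-style bound $\sqrt{d_L - 1} + \sqrt{d_R - 1}$, computed from the average left and right degrees. The function name `ramanujan_gap`, the use of average degrees, and this particular gap formula are assumptions made for illustration; the paper's exact definitions of $\Delta_S$ and $\Delta_R$ are not given in this summary.

```python
import numpy as np


def ramanujan_gap(mask):
    """Gap between a Ramanujan-style bound and the second singular value.

    `mask` is a 2D 0/1 array (rows = output units, columns = input units)
    marking which weights of one bipartite layer survive pruning.
    NOTE: hypothetical helper; the formula is an assumed, common formulation,
    not necessarily the one used in the paper.
    """
    mask = np.asarray(mask, dtype=float)

    # Singular values of the biadjacency matrix equal the non-negative
    # eigenvalues of the bipartite adjacency matrix (sorted descending).
    singular_values = np.linalg.svd(mask, compute_uv=False)
    second = singular_values[1] if singular_values.size > 1 else 0.0

    # Average degrees on each side of the bipartite graph.
    d_left = mask.sum(axis=1).mean()
    d_right = mask.sum(axis=0).mean()

    # Bound for (bi)regular bipartite Ramanujan graphs, applied here to
    # average degrees as an approximation for irregular masks.
    bound = np.sqrt(max(d_left - 1.0, 0.0)) + np.sqrt(max(d_right - 1.0, 0.0))
    return bound - second


if __name__ == "__main__":
    dense = np.ones((4, 4))   # fully connected layer
    matching = np.eye(4)      # each unit keeps exactly one edge
    print(ramanujan_gap(dense), ramanujan_gap(matching))
```

Under these assumptions a fully connected 4x4 mask yields a positive gap of about $2\sqrt{3}$, while a perfect matching (a disconnected graph) yields a gap of $-1$. A pruning loop could monitor this sign change as one possible stopping signal.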

Paper Structure (15 sections, 6 equations, 8 figures, 1 table)

This paper contains 15 sections, 6 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Variation in test set accuracy (left vertical axis) and spectral gap ($\Delta_S$, $\Delta_R$) (right vertical axis) on sequential MNIST data for RNN with the percentage of remaining edges $q$, considering the unweighted graph representation. The vertical lines show the first zero crossing of $\Delta_S$ and $\Delta_R$.
  • Figure 2: Variation in test set accuracy and spectral gap ($\Delta_S$) on sequential MNIST data for RNN with the percentage of remaining edges, considering the weighted graph representation. The vertical line shows the first zero crossing of $\Delta_S$.
  • Figure 3: Variation in test set accuracy and spectral gap ($\Delta_S$, $\Delta_R$) on sequential MNIST data for LSTM with the percentage of remaining edges, considering the unweighted graph representation. The vertical lines show the first zero crossing of $\Delta_S$ and $\Delta_R$.
  • Figure 4: Variation in test set accuracy and spectral gap ($\Delta_S$) on sequential MNIST data for LSTM with the percentage of remaining edges, considering the weighted graph representation. The vertical line shows the first zero crossing of $\Delta_S$.
  • Figure 5: Variation in test set accuracy and spectral gap ($\Delta_S$, $\Delta_R$) on sequential CIFAR-10 data for LSTM with the percentage of remaining edges, considering the unweighted graph representation. The vertical lines show the first zero crossing of $\Delta_S$ and $\Delta_R$.
  • ...and 3 more figures

Theorems & Definitions (1)

  • Definition 1: Expander and Cheeger constant