Ghost-Connect Net: A Generalization-Enhanced Guidance For Sparse Deep Networks Under Distribution Shifts

Mary Isabelle Wisell, Salimeh Yasaei Sekeh

TL;DR

The paper provides theoretical foundations for GC-Net's approach to improving generalization under distribution shifts. Experiments on common DNN benchmarks show promising results for a hybrid approach that uses GC-Net guidance for the later layers of a network and direct pruning for the earlier layers.

Abstract

Sparse deep neural networks (DNNs) excel in real-world applications such as robotics and computer vision by reducing the computational demands that hinder usability. Recent studies boost DNN efficiency by trimming redundant neurons or filters based on task relevance, but they neglect adaptability to distribution shifts. We aim to enhance these existing techniques by introducing a companion network, Ghost Connect-Net (GC-Net), which monitors the connections in the original network and offers an advantage in generalizing across distributions. GC-Net's weights represent connectivity measurements between consecutive layers of the original network. After GC-Net is pruned, the pruned locations are mapped back to the original network as pruned connections, which allows magnitude-based and connectivity-based pruning methods to be combined. Experiments on common DNN benchmarks, such as CIFAR-10, Fashion MNIST, and Tiny ImageNet, show promising results for a hybrid approach that uses GC-Net guidance for the later layers of a network and direct pruning for the earlier layers. We also provide theoretical foundations for GC-Net's approach to improving generalization under distribution shifts.
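The mapping step in the abstract (prune GC-Net, then carry the pruned locations back to the original network) can be sketched on a toy example. This is a minimal illustration only, under several assumptions that do not come from the paper: a single fully connected layer whose connectivity matrix has the same shape as its weight matrix, plain magnitude pruning as the criterion applied inside GC-Net, and the hypothetical helper names `magnitude_mask` and `apply_mask`.

```python
def magnitude_mask(matrix, sparsity):
    """Return a 0/1 mask that drops roughly the `sparsity` fraction of
    entries with the smallest absolute value (ties may drop a few more)."""
    flat = sorted(abs(v) for row in matrix for v in row)
    k = int(len(flat) * sparsity)  # number of entries to prune
    if k == 0:
        return [[1] * len(row) for row in matrix]
    threshold = flat[k - 1]
    return [[0 if abs(v) <= threshold else 1 for v in row] for row in matrix]


def apply_mask(weights, mask):
    """Zero out every weight whose mask entry is 0."""
    return [
        [w if m else 0.0 for w, m in zip(w_row, m_row)]
        for w_row, m_row in zip(weights, mask)
    ]


# Toy values (made up for illustration).
# GC-Net "weights": connectivity between units of two consecutive layers.
connectivity = [[0.9, 0.1],
                [0.2, 0.8]]
# Weights of the corresponding layer in the original network.
weights = [[0.5, -2.0],
           [1.5, 0.3]]

# Prune GC-Net, then map the pruned locations onto the original layer.
mask = magnitude_mask(connectivity, sparsity=0.5)
pruned = apply_mask(weights, mask)
```

With these toy values the connectivity-guided mask keeps the weights 0.5 and 0.3, whereas applying `magnitude_mask` directly to `weights` would keep -2.0 and 1.5. The two criteria can select different connections, which is what makes combining them per layer meaningful.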

Paper Structure

This paper contains 20 sections, 27 equations, 4 figures, 42 tables, 2 algorithms.

Figures (4)

  • Figure 1: GC-Net Guidance Overview: (Blue module) Generate activation matrices (step 1), create connectivity matrices via Pearson Correlation (step 2), expand matrices as GC-Net weights (step 3). (Red module) Prune GC-Net. (Green module) Map pruned GC-Net to original network. (Yellow module) Prune remaining original network layers.
  • Figure 2: When the outputs of two layers converge into a single subsequent layer, their respective connectivity matrices are added. This combined connectivity matrix is then treated as a single matrix and loaded as weights for the target layer in GC-Net.
  • Figure 3: VGG16-BN Hybrid: Part B is pruned directly in the original network $F_O$ and Part A is pruned with guidance from GC-Net $F_{GC}$. Models (1-4, from left) show: 1) Full GC-Net: the first layer is pruned directly and all other layers are pruned with GC-Net; 2) GC-Net - FH: the last 50% is pruned directly and the first 50% is pruned with GC-Net; 3) GC-Net - BH: the first 50% is pruned directly and the last 50% is pruned with GC-Net; 4) GC-Net-B25%: the first 75% is pruned directly and the last 25% is pruned with GC-Net.
  • Figure 4: Comparison of ResNet-18 OS-SynFlow FLOPs for original and hybrid methods using different similarity metrics for CIFAR-10.
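Steps 1 and 2 of Figure 1 and the merge rule in Figure 2 can be sketched in a few lines of plain Python. This is a hedged illustration, not the authors' implementation: it assumes activations are stored as one list of unit values per input sample, that connectivity is the signed Pearson correlation between a unit in one layer and a unit in the next, and that the matrix is oriented as `[output unit][input unit]`. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom > 0 else 0.0


def connectivity_matrix(acts_in, acts_out):
    """Correlate every unit of the next layer with every unit of the current layer.

    acts_in, acts_out: lists of samples, each sample a list of unit activations.
    Returns a matrix indexed as [output unit][input unit].
    """
    units_in = list(zip(*acts_in))    # one activation series per input unit
    units_out = list(zip(*acts_out))  # one activation series per output unit
    return [[pearson(o, i) for i in units_in] for o in units_out]


def add_matrices(a, b):
    """Element-wise sum, used when two layers feed the same target layer (Figure 2)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Toy activations over four samples (made up for illustration).
acts_in = [[1.0, 4.0], [2.0, 3.0], [3.0, 2.0], [4.0, 1.0]]  # two units
acts_out = [[2.0], [4.0], [6.0], [8.0]]                      # one unit

conn = connectivity_matrix(acts_in, acts_out)
```

Here the single output unit rises with the first input unit and falls with the second, so the two entries of `conn` have opposite signs. Whether the sign should be kept or replaced by an absolute value before the matrix is loaded as GC-Net weights is not stated in the captions above and is left open in this sketch.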