Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment

Alireza Ganjdanesh; Shangqian Gao; Heng Huang

TL;DR

A novel structural pruning approach that jointly learns the weights and structurally prunes the architectures of CNN models using a Reinforcement Learning agent whose actions determine the pruning ratios of the CNN model's layers and whose reward is the resulting model's accuracy.

Abstract

Structural model pruning is a prominent approach for reducing the computational cost of Convolutional Neural Networks (CNNs) before their deployment on resource-constrained devices. Yet, the majority of proposed methods require a pretrained model before pruning, which is costly to obtain. In this paper, we propose a novel structural pruning approach that jointly learns the weights and structurally prunes the architectures of CNN models. The core element of our method is a Reinforcement Learning (RL) agent whose actions determine the pruning ratios of the CNN model's layers and whose reward is the resulting model's accuracy. We conduct the joint training and pruning by iteratively training the model's weights and the agent's policy, and we regularize the model's weights to align with the structure selected by the agent. The evolving model's weights result in a dynamic reward function for the agent, which rules out prominent episodic RL methods that assume a stationary environment. We address this challenge by designing a mechanism that models the complex changing dynamics of the reward function and provides a representation of it to the RL agent. To do so, we assign a learnable embedding to each training epoch and employ a recurrent model to calculate a representation of the changing environment. We train the recurrent model and the embeddings using a decoder model that reconstructs the observed rewards. This design enables our agent to leverage episodic observations along with the environment representations to learn a policy that identifies performant sub-networks of the CNN model. Our extensive experiments on CIFAR-10 and ImageNet using ResNets and MobileNets demonstrate the effectiveness of our method.
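
The environment-representation mechanism described in the abstract (one learnable embedding per epoch, a recurrent model over those embeddings, and a decoder that reconstructs observed rewards) might be sketched roughly as follows. This is a forward-pass illustration only, not the authors' implementation: the plain tanh recurrence, the linear decoder, the scalar action, all dimensions, and every name (`epoch_embeddings`, `environment_state`, `decode_reward`, `reconstruction_loss`) are assumptions, and the parameters are random rather than trained.

```python
import math
import random

random.seed(0)

NUM_EPOCHS = 5   # hypothetical number of training epochs
EMBED_DIM = 4    # hypothetical size of each per-epoch embedding
HIDDEN_DIM = 3   # hypothetical size of the environment representation z


def rand_matrix(rows, cols, scale=0.5):
    """Random matrix standing in for parameters that would be learned."""
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Parameters that would be trained by gradient descent; random in this sketch.
epoch_embeddings = rand_matrix(NUM_EPOCHS, EMBED_DIM)  # one embedding per epoch
W_in = rand_matrix(HIDDEN_DIM, EMBED_DIM)              # embedding -> hidden
W_rec = rand_matrix(HIDDEN_DIM, HIDDEN_DIM)            # hidden -> hidden
decoder_w = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM + 1)]


def environment_state(epoch):
    """Run a tanh recurrence over the embeddings of epochs 0..epoch."""
    z = [0.0] * HIDDEN_DIM
    for t in range(epoch + 1):
        from_input = matvec(W_in, epoch_embeddings[t])
        from_state = matvec(W_rec, z)
        z = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
    return z


def decode_reward(z, action):
    """Assumed linear decoder: predicts a reward from z and a scalar action."""
    features = z + [action]
    return sum(w * f for w, f in zip(decoder_w, features))


def reconstruction_loss(epoch, observations):
    """Mean squared error between decoded and observed rewards."""
    z = environment_state(epoch)
    errors = [(decode_reward(z, a) - r) ** 2 for a, r in observations]
    return sum(errors) / len(errors)


# Made-up (action, reward) pairs standing in for episodic observations.
observed = [(0.3, 0.91), (0.5, 0.88), (0.7, 0.80)]
z_first = environment_state(0)
z_last = environment_state(NUM_EPOCHS - 1)
loss = reconstruction_loss(NUM_EPOCHS - 1, observed)
```

In a trained version, minimizing `reconstruction_loss` with respect to the embeddings, the recurrent weights, and the decoder would push `z` to summarize how the reward function has drifted across epochs, and `z` would then be passed to the agent alongside its usual state.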

Paper Structure (17 sections, 11 equations, 2 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Method
Notations
Iterative Weight Training and Compression
Modeling the Dynamic Nature of Rewards
RL Agent Training
Soft Regularization of the Model's Weights
Experiments
CIFAR-10 Results
ImageNet Results
Ablation Studies
Conclusion
Bounding our Agent's Actions
Implementation of our Agent's Actions
...and 2 more sections

Figures (2)

Figure 1: Overview of our method. We jointly train and prune a CNN model using an RL agent by iteratively training the agent's policy and model's weights. In each iteration, we train the model's weights for one epoch and perform several episodic observations of the agent. Left: Each action of our agent prunes one layer of the model, and the procedure of pruning the $l$-th layer is shown. The agent's actions on the previous layers and the remaining layers' FLOPs determine its state, and we take the resulting model's accuracy as its reward (Sec. "Iterative Weight Training and Compression"). As the model's weights change between iterations, the reward function also changes accordingly. Thus, we map each epoch to an embedding and employ a recurrent model to provide a state of the environment $z$ to the agent (Sec. "Modeling the Dynamic Nature of Rewards"). Right: Given a sub-network selected by the agent, we train the model's weights while softly regularizing them to align with the selected structure (Sec. "Soft Regularization of the Model's Weights").
Figure 2: Results of ablation experiments on CIFAR-10. (a-c) Best reward of our agent when using different numbers of episodes per epoch, for three pruning rates when pruning ResNet-56. (d-f) Best reward with and without our mechanism for providing representations of the environment to our agent during training, for three pruning rates for ResNet-56. (g-i) Same as (d-f), for MobileNet-V2.
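
The Figure 1 caption states that each action prunes one layer and that the agent's state depends on its previous actions and the remaining layers' FLOPs. A minimal sketch of that bookkeeping is given below. It assumes plain (ungrouped) convolutions, counts multiply-accumulates as FLOPs, prunes output channels only, and measures the remaining FLOPs on the unpruned layers; the toy three-layer network and all function and key names are hypothetical and not taken from the paper.

```python
def conv_flops(c_in, c_out, kernel, out_h, out_w):
    """Multiply-accumulate count of a standard dense convolution."""
    return c_in * c_out * kernel * kernel * out_h * out_w


def kept_channels(c_out, prune_ratio):
    """Output channels left after pruning; always keep at least one."""
    return max(1, round(c_out * (1.0 - prune_ratio)))


def pruned_flops(layers, prune_ratios):
    """Per-layer FLOPs after pruning each layer's output channels.

    Pruning a layer's outputs also shrinks the next layer's inputs.
    """
    flops = []
    c_in = layers[0]["c_in"]
    for layer, ratio in zip(layers, prune_ratios):
        c_out = kept_channels(layer["c_out"], ratio)
        flops.append(
            conv_flops(c_in, c_out, layer["kernel"], layer["out_h"], layer["out_w"])
        )
        c_in = c_out
    return flops


def agent_state(layers, previous_actions):
    """Assumed state layout: layer index, earlier actions, remaining FLOPs share."""
    layer_index = len(previous_actions)
    dense = pruned_flops(layers, [0.0] * len(layers))
    remaining = sum(dense[layer_index:])  # layers not yet acted on
    return {
        "layer_index": layer_index,
        "previous_actions": tuple(previous_actions),
        "remaining_flops_fraction": remaining / sum(dense),
    }


# Toy network: three 3x3 convolutions on 32x32 feature maps.
toy_layers = [
    {"c_in": 3, "c_out": 16, "kernel": 3, "out_h": 32, "out_w": 32},
    {"c_in": 16, "c_out": 32, "kernel": 3, "out_h": 32, "out_w": 32},
    {"c_in": 32, "c_out": 64, "kernel": 3, "out_h": 32, "out_w": 32},
]

dense_total = sum(pruned_flops(toy_layers, [0.0, 0.0, 0.0]))
pruned_total = sum(pruned_flops(toy_layers, [0.5, 0.5, 0.0]))
state_before_second_layer = agent_state(toy_layers, [0.5])
```

Because pruning a layer's outputs also reduces the next layer's inputs, halving the first two layers in this toy network leaves less than half of the dense FLOPs, which is why a per-layer pruning ratio does not translate one-to-one into an overall FLOPs budget.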
