MOGNET: A Mux-residual quantized Network leveraging Online-Generated weights

Van Thien Nguyen, William Guicquero, Gilles Sicard

TL;DR

MOGNET addresses the challenge of deploying CNNs on resource-constrained hardware by introducing a hardware-friendly architecture that combines a MUX-based residual block, a lightweight convolution factorization with on-line generated weights via Cellular Automata, and balanced ternary quantization. The key innovations are the MRB for efficient skip connections, CFLOG for parameter-efficient convolution factorization with CA-generated final weights, and BTQ to maintain balanced multi-level quantization during training. Empirical results on CIFAR-10 and CIFAR-100 show that, within a sub-2Mb memory budget and using $3$-bit activations, MOGNET can achieve competitive or superior accuracy compared to state-of-the-art compact models, with notable gains on CIFAR-100. These findings suggest a practical path toward edge- and ASIC-friendly CNNs that maintain accuracy while drastically reducing memory and computation requirements.

Abstract

This paper presents a compact model architecture called MOGNET, compatible with resource-limited hardware. MOGNET uses a streamlined convolutional factorization block that combines two point-wise (1x1) convolutions with a group-wise convolution in between. To further limit the overall model size and reduce the required on-chip memory, the parameters of the second point-wise convolution are generated on-line by a Cellular Automaton structure. In addition, MOGNET enables the use of low-precision weights and activations by taking advantage of a multiplexer mechanism with proper bit-shift rescaling, which integrates residual paths without increasing hardware complexity. To train this model efficiently, we also introduce a novel weight ternarization method that favors balance between the quantized levels. Experimental results show that, given a tiny memory budget (sub-2Mb), MOGNET can achieve higher accuracy, with a clear gap of up to 1%, at a similar or even lower model size compared to recent state-of-the-art methods.
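As a rough illustration of why the factorization described in the abstract reduces the stored model size, the sketch below counts weights for a standard convolution and for a factorized block. It rests on several assumptions not stated in the abstract: the grouped convolution uses a 3x3 kernel and maps the latent dimension `m` to `m` channels, biases are ignored, and the second point-wise convolution contributes no stored weights because it is generated on-line (any storage for the Cellular Automaton configuration is also ignored). The function names are hypothetical.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights of a dense k x k convolution (no bias)."""
    return c_in * c_out * k * k


def cflog_stored_params(c_in, m, g, k=3):
    """Stored weights of a factorized block under the assumptions above.

    Counts the first point-wise (1x1) convolution, c_in -> m, and a grouped
    k x k convolution, m -> m with g groups. The second point-wise
    convolution is assumed to be generated on-line and is not counted.
    """
    if m % g != 0:
        raise ValueError("latent dimension m must be divisible by g")
    pointwise_in = c_in * m
    # Each of the m output channels only sees m // g input channels.
    grouped = (m // g) * m * k * k
    return pointwise_in + grouped


# Illustrative sizes only; these are not taken from the paper.
dense = standard_conv_params(c_in=64, c_out=128)
factorized = cflog_stored_params(c_in=64, m=32, g=4)
print(dense, factorized, round(dense / factorized, 1))
```

With these illustrative sizes the stored weight count drops by more than an order of magnitude, which is the kind of saving the factorization targets; the real ratio depends on the layer dimensions actually used.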

Paper Structure

This paper contains 10 sections, 6 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: Top-level architecture description of MOGNET with Convolutional Factorization Leveraging On-line Generated weights (CFLOG) and MUX Residual Block (MRB). The final 1$\times$1 convolution is followed by Batch Normalization (BN) prior to a Global Average Pooling (GAP). Here $n, m$ are the parameters controlling the number of output feature maps and the latent dimension in CFLOG, MP stands for 2$\times$2 Max Pooling and g-GConv is Grouped Convolution with g groups.
  • Figure 2: CFLOG description with CA-generated weights.
  • Figure 3: Balanced ternary quantization with histogram bin equalization when 2 tertiles ($q_1, q_2$) are symmetrical and coincide with the quantization thresholds ($-\frac{s}{2}, \frac{s}{2}$).
  • Figure 4: Test accuracy of different compression method-model couplings. Our models use 3-bit activations.
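The balanced situation described in the Figure 3 caption, where the two tertiles $q_1, q_2$ act as the quantization thresholds, can be sketched in a few lines. This is only a minimal illustration of the equalized end state, not the paper's training procedure: it assumes the weights are thresholded directly at their empirical tertiles, and the function name `ternarize_at_tertiles` is hypothetical.

```python
import random
import statistics


def ternarize_at_tertiles(weights):
    """Map each weight to -1, 0 or +1 using the two tertiles as thresholds."""
    # statistics.quantiles with n=3 returns the two cut points (q1, q2).
    q1, q2 = statistics.quantiles(weights, n=3)
    levels = [-1 if w < q1 else (1 if w > q2 else 0) for w in weights]
    return levels, q1, q2


random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(3000)]
levels, q1, q2 = ternarize_at_tertiles(weights)

# Each ternary level should hold roughly one third of the weights.
counts = {v: levels.count(v) for v in (-1, 0, 1)}
print(counts)

# In the symmetric case of Figure 3 (q1 = -s/2, q2 = s/2), the step s is q2 - q1.
s = q2 - q1
print(round(q1, 3), round(q2, 3), round(s, 3))
```

Thresholding at the tertiles equalizes the three histogram bins by construction; for a zero-centered weight distribution the tertiles are approximately symmetric, which matches the case the caption describes.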