RepAct: The Re-parameterizable Adaptive Activation Function

Xian Wu, Qingchuan Tao, Shuang Wang

TL;DR

RepAct is a re-parameterizable adaptive activation function: it trains with several weighted activation branches and merges them into a single branch at inference, giving lightweight networks richer feature learning without extra inference cost. Two variants extend it: RepAct-II uses Softmax-based competition and RepAct-III uses BN-based cooperation to regulate branch influence and incorporate global information. Across the ImageNet100, CIFAR-100, and VOC12 benchmarks, RepAct improves classification accuracy (by up to 7.92% on MobileNetV3-Small for ImageNet100) and detection/segmentation performance while keeping inference complexity low. The authors report improved forward feature scaling and backward gradient propagation, supported by GradCAM analyses, and present RepAct as a practical path to deploying stronger edge-AI models with minimal latency or memory overhead.

Abstract

Addressing the imperative need for efficient artificial intelligence in IoT and edge computing, this study presents RepAct, a re-parameterizable adaptive activation function tailored for optimizing lightweight neural networks within the computational limitations of edge devices. By employing a multi-branch structure with learnable adaptive weights, RepAct enriches feature processing and enhances cross-layer interpretability. When evaluated on tasks such as image classification and object detection, RepAct notably surpassed conventional activation functions in lightweight networks, delivering up to a 7.92% accuracy boost on MobileNetV3-Small for the ImageNet100 dataset, while maintaining computational complexity on par with HardSwish. This innovative approach not only maximizes model parameter efficiency but also significantly improves the performance and understanding capabilities of lightweight neural networks, demonstrating its potential for real-time edge computing applications.
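
The train-time multi-branch form and its merged single-branch inference form described above can be illustrated with a minimal scalar sketch. It is not the authors' implementation. The branch set (Identity, ReLU, HardSwish), the function names, and the example weights are assumptions chosen for illustration. The `softmax` helper only hints at how Softmax-normalized branch weights, as in RepAct-II, might be produced.

```python
import math

def relu(x):
    return max(x, 0.0)

def hardswish(x):
    # HardSwish: x * ReLU6(x + 3) / 6
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0

def repact_train(x, alphas):
    """Training form (assumed): weighted sum of three activation branches."""
    a_id, a_relu, a_hs = alphas
    return a_id * x + a_relu * relu(x) + a_hs * hardswish(x)

def merge_branches(alphas):
    """Fold the branch weights into one piecewise function for inference."""
    a_id, a_relu, a_hs = alphas

    def merged(x):
        if x <= -3.0:                      # ReLU and HardSwish are both zero
            return a_id * x
        if x >= 3.0:                       # every branch is linear in x
            return (a_id + a_relu + a_hs) * x
        slope = a_id + (a_relu if x > 0.0 else 0.0)
        return slope * x + a_hs * x * (x + 3.0) / 6.0

    return merged

def softmax(weights):
    """Normalize raw branch weights so they compete and sum to 1 (assumed)."""
    m = max(weights)
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical learned weights; both forms should agree at every input.
alphas = (0.2, 0.5, 0.3)
merged = merge_branches(alphas)
for x in (-5.0, -3.0, -1.5, 0.0, 0.7, 3.0, 4.2):
    assert abs(repact_train(x, alphas) - merged(x)) < 1e-9
```

Under these assumptions the merged function evaluates one piecewise expression per input, which is consistent with the claim that inference complexity stays on par with HardSwish.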

Paper Structure

This paper contains 18 sections, 16 equations, 18 figures, 1 table.

Figures (18)

  • Figure 1: Schematic diagram of the single-branch structure of RepAct I
  • Figure 2: RepAct I activation function and its first derivative: (a) RepAct initialized with equal branch weights; (b) RepAct I after $\alpha_n$ fluctuates within [0, 1]; (c) derivative of RepAct I for $\alpha_n$ in [0, 1]
  • Figure 3: Forward and backward diagram of neural network
  • Figure 4: CNN network structure and feature map sparse activation heat map
  • Figure 5: Schematic diagram of ReLU and RepAct (Identity and ReLU) backpropagation chain
  • ...and 13 more figures