Competition-based Adaptive ReLU for Deep Neural Networks

Junjia Chen, Zhibin Pan

TL;DR

The paper addresses the asymmetry of conventional activation functions, which pass positive values while blocking or suppressing negative ones, by introducing a competition-based activation, CAReLU. It defines Competition-based Adaptive Scaling (CAS), which modulates the pre-activations with a smooth, trainable scale factor: $CAS(\mathbf{z}) = K\,\tanh(\alpha p + \beta)\,\mathbf{z}$, where $p\in\{p_E,p_{L1},p_c\}$ and $p_E = \frac{\sum_j [\max(z_j,0)]^2}{\|\mathbf{z}\|^2 + \epsilon}$. The activation is then $\mathrm{CAReLU}(\mathbf{z}) = \mathrm{ReLU}(CAS(\mathbf{z}))$. The method uses two trainable per-layer parameters, $\alpha$ and $\beta$, provides continuous gradient behavior, and includes BN-aware variants that preserve the competition under batch normalization. Empirically, CAReLU variants improve classification accuracy and PSNR on the CIFAR-100, BSD-300, and SNLI tasks, with $\mathrm{CAReLU}_E$ frequently giving the largest gains, which demonstrates the practical value of exploiting positive–negative activation competition in deep networks.
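The scaling rule above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`p_energy`, `cas`, `carelu`) are hypothetical, $K$ is treated as a fixed constant defaulting to 1, only the $p_E$ indicator is covered because it is the only one defined here, and $\alpha$ and $\beta$ are passed in as plain numbers rather than trained.

```python
import math


def p_energy(z, eps=1e-8):
    """p_E: fraction of the squared norm of z carried by its positive entries."""
    positive = sum(max(v, 0.0) ** 2 for v in z)
    total = sum(v * v for v in z)
    return positive / (total + eps)


def cas(z, alpha, beta, k=1.0, eps=1e-8):
    """CAS(z) = K * tanh(alpha * p + beta) * z, here with p = p_E.

    k = 1.0 is an assumed default; the summary does not give a value for K.
    """
    scale = k * math.tanh(alpha * p_energy(z, eps) + beta)
    return [scale * v for v in z]


def carelu(z, alpha, beta, k=1.0, eps=1e-8):
    """CAReLU(z) = ReLU(CAS(z))."""
    return [max(v, 0.0) for v in cas(z, alpha, beta, k, eps)]


z = [3.0, -4.0]  # p_E is about 9 / 25 for this vector

# Positive scale factor: the positive entry passes, the negative one is zeroed.
print(carelu(z, alpha=1.0, beta=0.0))

# Negative scale factor: the signs flip before ReLU, so the originally
# negative entry is the one that passes.
print(carelu(z, alpha=-1.0, beta=0.0))
```

Because a single scalar multiplies the whole vector, the sign of $\tanh(\alpha p + \beta)$ decides which side of the competition survives the ReLU, and its magnitude decides how strongly the surviving entries are scaled.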

Abstract

Activation functions introduce nonlinearity into deep neural networks. Most popular activation functions allow positive values to pass through while blocking or suppressing negative values. Starting from the idea that positive and negative values are equally important and must compete for activation, we propose a new Competition-based Adaptive ReLU (CAReLU). CAReLU scales the input values according to the outcome of the competition between positive and negative values. It defines two parameters that adjust the scaling strategy and can be trained jointly with the other network parameters. We verify the effectiveness of CAReLU on image classification, super-resolution, and natural language processing tasks. In our experiments, the method outperforms other widely used activation functions. In particular, replacing ReLU in ResNet-18 with the proposed activation function improves classification accuracy on the CIFAR-100 dataset. Its effectiveness, together with the new perspective of exploiting the competition between positive and negative values, makes CAReLU a promising activation function.

Paper Structure

This paper contains 10 sections, 16 equations, 1 figure, 4 tables.

Figures (1)

  • Figure 1: Histograms of $\alpha p + \beta$ obtained from the best trained model. (a) ResNet-18/${\rm CAReLU_E}$. (b) GoogLeNet/${\rm BN\text{-}CAReLU_E}$. (c) VGG-13/${\rm BN\text{-}CAReLU_E}$.