DSReLU: A Novel Dynamic Slope Function for Superior Model Training

Archisman Chakraborti; Bidyut B Chaudhuri

TL;DR

The paper introduces DSReLU, an activation function whose slope changes over the course of training, with the aim of mitigating vanishing gradients and overfitting in vision models. The function f(x;t) uses a time-dependent slope s(t) that moves from a steep regime to a gentler one via a logistic function, so that learning is fast early in training and more stable later. On Mini-ImageNet, CIFAR-100, and MIT-BIH with a ResNet-34 backbone, DSReLU is reported to achieve higher validation accuracy and F1-scores than ReLU, Mish, and other baselines, with competitive training times. The authors present DSReLU as a promising activation design for improved learning dynamics and generalization, and outline future work on broader architectures and parameter exploration.
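
The summary above gives only the general shape of the function: a ReLU-like activation whose slope s(t) follows a logistic transition from a steeper value to a gentler one. The sketch below illustrates that idea and is not the paper's exact formulation. The names `s_start` and `s_end`, their default values, the midpoint at t = 0.5, the normalization of training progress t to [0, 1], and the choice to apply the slope only to positive inputs are all assumptions made for illustration; `k` is taken to be the steepness of the transition.

```python
import math


def slope(t, s_start=2.0, s_end=1.0, k=10.0):
    """Logistic transition of the slope over training progress t in [0, 1].

    s_start, s_end, the 0.5 midpoint and the defaults are illustrative
    assumptions, not values taken from the paper.
    """
    gate = 1.0 / (1.0 + math.exp(-k * (t - 0.5)))  # ~0 early, ~1 late
    return s_start + (s_end - s_start) * gate


def dsrelu_sketch(x, t, **slope_kwargs):
    """ReLU-like activation: 0 for x <= 0, slope s(t) for x > 0 (assumed form)."""
    return slope(t, **slope_kwargs) * max(0.0, x)


if __name__ == "__main__":
    # Print the slope at a few points of (normalized) training progress.
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"t={t:.2f}  s(t)={slope(t):.4f}")
```

In a training loop, t would typically be computed as the current step divided by the total number of steps and passed to the activation at each forward pass; a larger `k` makes the switch between the two regimes more abrupt.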

Abstract

This study introduces a novel activation function, characterized by a dynamic slope that adjusts throughout the training process, aimed at enhancing adaptability and performance in deep neural networks for computer vision tasks. The rationale behind this approach is to overcome limitations associated with traditional activation functions, such as ReLU, by providing a more flexible mechanism that can adapt to different stages of the learning process. Evaluated on the Mini-ImageNet, CIFAR-100, and MIT-BIH datasets, our method demonstrated improvements in classification metrics and generalization capabilities. These results suggest that our dynamic slope activation function could offer a new tool for improving the performance of deep learning models in various image recognition tasks.

Paper Structure (20 sections, 5 equations, 5 figures, 4 tables)

Methodology
Activation Function Design
Mathematical Formulation:
Rationale:
Analysis of DSReLU Function
Experimental Setup and Model Architecture
Model Architecture
Loss Function and Optimizer
Hardware Used
Evaluation Metrics
Results and Discussion
Activation Functions Compared
Mini-ImageNet
Performance on Training and Validation Sets During Training on Mini-ImageNet:
The CIFAR-100 dataset
...and 5 more sections

Figures (5)

Figure 1: Slope change of DSReLU with parameter $k$
Figure 2: Performance of different activation functions on the Mini-ImageNet Dataset
Figure 3: Performance of different activation functions on the CIFAR-100 Dataset
Figure 4: Performance of different activation functions on the MIT-BIH Dataset
Figure 5: Mean training time of ResNet-34 model with different activation functions across different datasets.
