AdaResNet: Enhancing Residual Networks with Dynamic Weight Adjustment for Improved Feature Integration

Hong Su

TL;DR

AdaResNet replaces the fixed 1:1 skip-connection fusion in ResNet with a trainable weight $weight_{tfd}^{ipd}$ that balances the input representation data ($ipd$) against the transformed data ($tfd$) during training. The weight is updated by gradient descent as part of the ordinary forward and backward passes, and it can be assigned per layer or per stage. Empirical results on CIFAR-10 with ResNet-50 show substantial accuracy gains over traditional ResNet, and the analysis shows that the learned weights depend on both the layer and the task, which supports adaptive skip connections as a generalizable improvement. The approach offers a flexible mechanism for improving the training and generalization of deep networks across architectures and datasets.

Abstract

In very deep neural networks, gradients can become extremely small during backpropagation, which makes the early layers hard to train. ResNet (Residual Network) addresses this issue with skip connections that let gradients flow directly through the network, making much deeper networks trainable. In these skip connections, however, the input $ipd$ is added directly to the transformed data $tfd$, so the two are always weighted equally regardless of the scenario. In this paper, we propose AdaResNet (Auto-Adapting Residual Network), which automatically adjusts the ratio between $ipd$ and $tfd$ based on the training data. We introduce a variable, $weight_{tfd}^{ipd}$, to represent this ratio. The variable is adjusted during backpropagation, so it adapts to the training data rather than remaining fixed. Experimental results demonstrate that AdaResNet achieves a maximum accuracy improvement of over 50% compared to traditional ResNet.
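
The weighted skip connection described above can be illustrated with a small, dependency-free sketch. It assumes the fusion takes the form $y = weight_{tfd}^{ipd} \cdot ipd + tfd$ with a single scalar weight and uses a squared-error loss; both choices, along with the function names `forward`, `grad_w`, and `train_step`, are assumptions made for illustration and are not taken from the paper's implementation.

```python
# Hypothetical sketch of a weighted skip connection (not the authors' code).
# Assumed fusion rule: y = w * ipd + tfd, where w stands in for weight_tfd^ipd.

def forward(ipd, tfd, w):
    """Scale the input branch by w, then add the transformed branch."""
    return [w * i + t for i, t in zip(ipd, tfd)]

def grad_w(ipd, grad_y):
    """dL/dw = sum_k (dL/dy_k * ipd_k), because dy_k/dw = ipd_k."""
    return sum(g * i for g, i in zip(grad_y, ipd))

def train_step(ipd, tfd, target, w, lr=0.1):
    """One gradient-descent update of w under an assumed squared-error loss."""
    y = forward(ipd, tfd, w)
    # L = 0.5 * sum_k (y_k - target_k)^2, so dL/dy_k = y_k - target_k
    grad_y = [yk - tk for yk, tk in zip(y, target)]
    return w - lr * grad_w(ipd, grad_y)

# Toy data: the target is exactly reachable with w = 0.5.
ipd = [1.0, 2.0]
tfd = [0.5, -0.5]
target = [1.0, 0.5]

w = 1.0  # w = 1 reproduces the standard ResNet sum ipd + tfd
for _ in range(50):
    w = train_step(ipd, tfd, target, w)
```

Starting from `w = 1.0` makes the first forward pass identical to an ordinary residual sum, so the sketch shows the weight drifting away from the fixed 1:1 ratio only when the data call for it. In a real network the same gradient would be computed by the framework's automatic differentiation, with one such weight per layer or per stage.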

Paper Structure (30 sections, 13 equations, 7 figures, 2 tables, 3 algorithms)

Introduction
Model
Gradient Descent Algorithm
Gradient of the Loss Function with Respect to the Output $\mathbf{y}$
Gradient of the Output $\mathbf{y}$ with Respect to $weight_{tfd}^{ipd}$
Gradient of the Loss Function with Respect to $weight_{tfd}^{ipd}$
Parameter Update
Training Neural Network with Custom Parameter $weight_{tfd}^{ipd}$
Forward Pass of $weight_{tfd}^{ipd}$
Calculating the Loss Function
Backward Pass
Updating the Parameters
Brief Explanation
Factors Influencing $weight_{tfd}^{ipd}$
Dependency on Training Datasets
...and 15 more sections

Figures (7)

Figure 1: ResNet adds the input directly to the intermediately processed data to strengthen gradients in deep neural networks
Figure 2: Incorporating weighting into residual learning and blocks
Figure 3: Comparison of training accuracy
Figure 4: Comparison of test accuracy
Figure 5: Weights of different layers
...and 2 more figures

Theorems & Definitions (2)

Definition 1
Definition 2
