DUA-D2C: Dynamic Uncertainty Aware Method for Overfitting Remediation in Deep Learning

Md. Saiful Bari Siddiqui, Md Mohaiminul Islam, Md. Golam Rabiul Alam

TL;DR

DUA-D2C tackles deep-learning overfitting by partitioning the training data into $N$ shards, training one edge model per shard, and dynamically weighting each edge model's contribution according to its accuracy $a_i$ and uncertainty $u_i$ on a shared validation set. The central model is updated as a weighted sum $\theta_c \leftarrow \sum_i \alpha_i \theta'_i$, with each $\alpha_i$ derived from the score $s_i = \lambda a_i + (1-\lambda) u_i$, enabling variance reduction beyond uniform averaging. The framework is validated across image, audio, and text tasks, where it shows improved generalization, delayed overfitting, and smoother decision boundaries; augmentation can further bolster performance. The approach increases training time, but it delivers robust gains and is compatible with existing regularization techniques, making it a practical tool for generalization in modern deep learning, especially in large-scale settings.
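The aggregation step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `aggregate_edge_models`, the flat-vector representation of parameters, and the sum-to-one normalization $\alpha_i = s_i / \sum_j s_j$ are all assumptions, since the summary only says that $\alpha_i$ is "derived from" $s_i$. It also assumes $u_i$ is supplied as a score in which higher is better, because the summary does not define how $u_i$ is computed.

```python
import numpy as np


def aggregate_edge_models(edge_params, accuracies, uncertainty_scores, lam=0.5):
    """Combine edge-model parameters into central parameters.

    edge_params        : list of 1-D arrays, one flattened parameter vector per edge model
    accuracies         : a_i, validation accuracy of each edge model
    uncertainty_scores : u_i, assumed here to be "higher is better" scores
    lam                : lambda, trade-off between accuracy and uncertainty terms
    """
    a = np.asarray(accuracies, dtype=float)
    u = np.asarray(uncertainty_scores, dtype=float)

    # Composite score from the summary: s_i = lambda * a_i + (1 - lambda) * u_i
    s = lam * a + (1.0 - lam) * u

    total = s.sum()
    if total <= 0:
        raise ValueError("Scores must sum to a positive value to form weights.")

    # ASSUMPTION: weights are the scores normalized to sum to one.
    alpha = s / total

    # Weighted sum: theta_c = sum_i alpha_i * theta'_i
    stacked = np.stack([np.asarray(p, dtype=float) for p in edge_params])
    theta_c = (alpha[:, None] * stacked).sum(axis=0)
    return theta_c, alpha


if __name__ == "__main__":
    # Two hypothetical edge models with two parameters each.
    params = [np.array([1.0, 1.0]), np.array([0.0, 0.0])]
    theta_c, alpha = aggregate_edge_models(params, [0.8, 0.6], [0.8, 0.6], lam=0.5)
    print("alpha:", alpha)
    print("theta_c:", theta_c)
```

With equal scores for every edge model, the weights become uniform and the update reduces to plain parameter averaging, which is the baseline the weighted scheme is meant to improve on.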

Abstract

Overfitting remains a significant challenge in deep learning, often arising from data outliers, noise, and limited training data. To address this, the Divide2Conquer (D2C) method was previously proposed, which partitions training data into multiple subsets and trains identical models independently on each. This strategy enables learning more consistent patterns while minimizing the influence of individual outliers and noise. However, D2C's standard aggregation typically weights all subset models equally or by fixed heuristics (like data size), potentially underutilizing information about their varying generalization capabilities. Building upon this foundation, we introduce Dynamic Uncertainty-Aware Divide2Conquer (DUA-D2C), an advanced technique that refines the aggregation process. DUA-D2C dynamically weights the contributions of subset models based on their performance on a shared validation set, considering both accuracy and prediction uncertainty. This intelligent aggregation allows the central model to preferentially learn from subsets yielding more generalizable and confident edge models, thereby more effectively combating overfitting. Empirical evaluations on benchmark datasets spanning multiple domains demonstrate that DUA-D2C significantly improves generalization. Our analysis includes evaluations of decision boundaries, loss curves, and other performance metrics, highlighting the effectiveness of DUA-D2C. This study demonstrates that DUA-D2C improves generalization performance even when applied on top of other regularization methods, establishing it as a theoretically grounded and effective approach to combating overfitting in modern deep learning. Our code is publicly available at: https://github.com/Saiful185/DUA-D2C.

Paper Structure

This paper contains 23 sections, 25 equations, 15 figures, 8 tables, 1 algorithm.

Figures (15)

  • Figure 1: When the corresponding complexity term grows and dominates the expression in Eq. (7) for empirical error, the model starts to overfit, as the empirical error is no longer a faithful representation of the true error.
  • Figure 2: A visualization of (a) Underfitting, (b) A good fit, and (c) Overfitting scenarios, illustrating how the model changes if an arbitrary training data point $(x,y)$ is shifted to $(p,q)$ by plotting the trained model before (solid line) and after (dashed line) the shift.
  • Figure 3: Decision boundary visualization with the Traditional approach vs. Divide2Conquer (D2C).
  • Figure 4: Decision Boundary from the edge models trained on each subset after dividing the training data into 3 subsets using the proposed method.
  • Figure 5: Validation loss curves for the traditional & the DUA-D2C method for different numbers of subsets, N, on CIFAR-10.
  • ...and 10 more figures