On the Over-Memorization During Natural, Robust and Catastrophic Overfitting

Runqi Lin; Chaojian Yu; Bo Han; Tongliang Liu

TL;DR

The paper addresses overfitting in deep neural networks across natural, robust, and catastrophic regimes through a unified view centered on over-memorization. It identifies over-memorization as a shared mechanism in which models suddenly become highly confident on certain training patterns and retain a persistent memory of them; over-memorized adversarial patterns often coincide with high-confidence predictions on the corresponding natural patterns. To counter this, it introduces Distraction Over-Memorization (DOM), which uses a fixed loss threshold to identify over-memorized patterns and either removes or iteratively augments these high-confidence patterns to reduce reliance on them. Extensive experiments on CIFAR-10/100, SVHN, and Tiny-ImageNet, including ViT architectures, show that DOM consistently mitigates overfitting and improves generalization and robustness in both natural and adversarial training.

Abstract

Overfitting negatively impacts the generalization ability of deep neural networks (DNNs) in both natural and adversarial training. Existing methods struggle to consistently address different types of overfitting, typically designing strategies that focus separately on either natural or adversarial patterns. In this work, we adopt a unified perspective by solely focusing on natural patterns to explore different types of overfitting. Specifically, we examine the memorization effect in DNNs and reveal a shared behaviour termed over-memorization, which impairs their generalization capacity. This behaviour manifests as DNNs suddenly becoming high-confidence in predicting certain training patterns and retaining a persistent memory for them. Furthermore, when DNNs over-memorize an adversarial pattern, they tend to simultaneously exhibit high-confidence prediction for the corresponding natural pattern. These findings motivate us to holistically mitigate different types of overfitting by hindering DNNs from over-memorizing training patterns. To this end, we propose a general framework, Distraction Over-Memorization (DOM), which explicitly prevents over-memorization by either removing or augmenting the high-confidence natural patterns. Extensive experiments demonstrate the effectiveness of our proposed method in mitigating overfitting across various training paradigms.
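
The removal variant of DOM can be illustrated with a minimal sketch. It assumes that low per-sample cross-entropy loss serves as the proxy for a high-confidence pattern, in line with the fixed loss threshold mentioned in the TL;DR. The function names, the threshold value of 0.2, and the toy probabilities are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def cross_entropy(probs, label):
    """Per-sample cross-entropy from predicted class probabilities."""
    return -math.log(max(probs[label], 1e-12))


def dom_removal_loss(batch_probs, labels, loss_threshold=0.2):
    """Average the loss over patterns that are not yet high-confidence.

    Patterns whose loss falls below `loss_threshold` are treated as
    over-memorized and excluded from the update (removal variant).
    The threshold value is an illustrative assumption.
    """
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, labels)]
    keep = [loss >= loss_threshold for loss in losses]
    kept_losses = [loss for loss, k in zip(losses, keep) if k]
    mean_loss = sum(kept_losses) / len(kept_losses) if kept_losses else 0.0
    return mean_loss, keep


# Toy batch: the first pattern is predicted with 0.99 confidence,
# so its loss is below the threshold and it is dropped.
probs = [[0.99, 0.01], [0.6, 0.4], [0.3, 0.7]]
labels = [0, 0, 0]
mean_loss, keep = dom_removal_loss(probs, labels)
print(keep)  # [False, True, True]
```

In an actual training loop the same mask would be applied to per-sample losses before backpropagation. The augmentation variant would instead replace the masked patterns with transformed versions rather than dropping them.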

Paper Structure (22 sections, 2 equations, 6 figures, 13 tables, 1 algorithm)

Introduction
Related Work
Memorization Effect
Natural Overfitting
Robust and Catastrophic Overfitting
Understanding Overfitting in Various Training Paradigms
Over-Memorization in Natural Training
Over-Memorization in Adversarial Training
Proposed Approach
Experiments
Experiment Settings
Performance Evaluation
Ablation Studies
Conclusion
Detailed Experiment Settings
...and 7 more sections

Figures (6)

Figure 1: Left Panel: The training and test accuracy of natural training. Middle Panel: Proportion of training patterns based on varying loss ranges. Right Panel: Model's generalization gap after removing different categories of high-confidence (HC) patterns.
Figure 2: The loss curves for both original and transformed high-confidence (HC) patterns after removing all HC patterns.
Figure 3: 1st Panel: The training and test accuracy of adversarial training. 2nd/3rd Panel: Proportion of adversarial/natural patterns based on varying training loss ranges. 4th Panel: The overlap rate between natural and adversarial patterns grouped by training loss rankings.
Figure 4: The average loss of adversarial pattern grouped by natural training loss.
Figure 5: Ablation Study
...and 1 more figures
