
AugLoss: A Robust Augmentation-based Fine Tuning Methodology

Kyle Otstot, Andrew Yang, John Kevin Cava, Lalitha Sankar

TL;DR

AugLoss tackles the dual challenge of train-time label noise and test-time distribution shifts by unifying AugMix-style data augmentation with tunable robust loss functions. The framework trains on augmented views of each input with a regularizer that encourages consistent predictions across those views, while adopting robust losses (e.g., focal, NCE+RCE, alpha-loss) to resist label noise. Experiments on CIFAR-10/100 and Tiny ImageNet show that AugLoss generally surpasses baselines across multiple corruption regimes, though no single loss is universally superior. The work offers a practical blueprint for designing more reliable deep learning models under real-world corruptions and points to future research on real-world datasets and stronger augmentation strategies.

Abstract

Deep Learning (DL) models have achieved great success in many domains. However, DL models increasingly face safety and robustness concerns, including noisy labeling in the training stage and feature distribution shifts in the testing stage. Previous work has made significant progress in addressing these problems, but the focus has largely been on developing solutions for only one problem at a time. For example, recent work has argued for the use of tunable robust loss functions to mitigate label noise, and data augmentation (e.g., AugMix) to combat distribution shifts. As a step towards addressing both problems simultaneously, we introduce AugLoss, a simple but effective methodology that achieves robustness against both train-time noisy labeling and test-time feature distribution shifts by unifying data augmentation and robust loss functions. We conduct comprehensive experiments in varied settings of real-world dataset corruption to showcase the gains achieved by AugLoss compared to previous state-of-the-art methods. Lastly, we hope this work will open new directions for designing more robust and reliable DL models under real-world corruptions.
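
As a rough illustration of the idea described above, the following Python sketch combines a robust loss on the clean view with a consistency penalty across augmented views. It is not the authors' implementation: the focal loss is used in its standard form as one example of a robust loss, the Jensen-Shannon divergence is assumed as the consistency term (following the AugMix convention), and the names `augloss_objective`, `lam`, and `gamma` are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(probs, label, gamma=2.0):
    """Standard focal loss; reduces to cross-entropy when gamma = 0."""
    p = probs[label]
    return -((1.0 - p) ** gamma) * math.log(max(p, 1e-12))

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(dists):
    """Generalized Jensen-Shannon divergence: mean KL to the mixture."""
    k = len(dists)
    mix = [sum(d[i] for d in dists) / k for i in range(len(dists[0]))]
    return sum(kl_divergence(d, mix) for d in dists) / k

def augloss_objective(logits_clean, logits_aug1, logits_aug2, label,
                      lam=1.0, gamma=2.0):
    """Robust loss on the clean view plus a weighted consistency term.

    Assumption: lam (consistency weight) and gamma (focal parameter)
    are illustrative values, not settings taken from the paper.
    """
    p_clean = softmax(logits_clean)
    p_aug1 = softmax(logits_aug1)
    p_aug2 = softmax(logits_aug2)
    robust = focal_loss(p_clean, label, gamma)
    consistency = js_divergence([p_clean, p_aug1, p_aug2])
    return robust + lam * consistency
```

When all three views produce the same predictions, the consistency term vanishes and the objective reduces to the robust loss alone; disagreement between views increases it. Swapping `focal_loss` for another robust loss leaves the rest of the sketch unchanged.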

Paper Structure

This paper contains 20 sections, 16 equations, 9 figures, 7 tables.

Figures (9)

  • Figure 1: The 15 common corruptions on a HORSE-labeled image, found in the CIFAR-10-C dataset.
  • Figure 2: AugLoss: the unification of data augmentation and robust loss functions.
  • Figure 4: The performance of each method type across symmetric, asymmetric, and human-annotated settings of label noise. The Noisy Avg. results for CIFAR-10, CIFAR-100, and Tiny ImageNet are included in the synthetic panels, while the CIFAR-10N results are included in the human-annotated panel. Hatched bars indicate the best-performing method type for each setting. Note that our proposed methodology, AugLoss, a.k.a. Augmix+Robust, is the clear winner in all settings considered.
  • Figure 5: The performance of each loss function across the synthetic (symmetric + asymmetric) settings of label noise. Dots are color-coded according to the best-performing loss function at each setting and noise rate.
  • Figure 6: The performance of each loss function across the five CIFAR-10N datasets. Dots are color-coded according to the best-performing loss function for each dataset.
  • ...and 4 more figures