Using Pre-Training Can Improve Model Robustness and Uncertainty
Dan Hendrycks, Kimin Lee, Mantas Mazeika
TL;DR
The paper challenges the view that pre-training only speeds up convergence. It shows that pre-training can substantially improve robustness to adversarial perturbations, label noise, and class imbalance, and that it improves uncertainty estimates as measured by out-of-distribution (OOD) detection and calibration. It introduces adversarial pre-training and reports large, sometimes state-of-the-art gains across CIFAR and ImageNet-derived settings, often outperforming task-specific methods. The results support a pre-train-then-fine-tune paradigm and suggest that future robustness and uncertainty methods should be evaluated with pre-trained models to obtain realistic assessments. Overall, pre-training provides benefits beyond convergence speed, improving the reliability and safety of deep learning systems.
Abstract
He et al. (2018) have called into question the utility of pre-training by showing that training from scratch can often yield similar performance to pre-training. We show that although pre-training may not improve performance on traditional classification metrics, it improves model robustness and uncertainty estimates. Through extensive experiments on adversarial examples, label corruption, class imbalance, out-of-distribution detection, and confidence calibration, we demonstrate large gains from pre-training and complementary effects with task-specific methods. We introduce adversarial pre-training and show approximately a 10% absolute improvement over the previous state-of-the-art in adversarial robustness. In some cases, using pre-training without task-specific methods also surpasses the state-of-the-art, highlighting the need for pre-training when evaluating future methods on robustness and uncertainty tasks.
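To make "confidence calibration" concrete, the following is a minimal sketch of one common calibration measure, the expected calibration error (ECE). It is an illustration under stated assumptions, not the paper's evaluation code: the function name, the equal-width binning, and the choice of 10 bins are all assumptions, and the paper's own calibration metric may differ.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE with equal-width bins (hypothetical helper, not from the paper).

    confidences: max softmax probability per prediction, in [0, 1]
    correct:     1 if the prediction was right, else 0
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Interior bin edges; np.digitize maps each confidence to a bin in [0, n_bins - 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(confidences, edges[1:-1])

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        # Gap between empirical accuracy and mean confidence in this bin,
        # weighted by the fraction of predictions that fall in the bin.
        gap = abs(correct[mask].mean() - confidences[mask].mean())
        ece += mask.mean() * gap
    return float(ece)

# Four predictions made with 0.9 confidence, three of them correct:
# accuracy is 0.75, so the model is overconfident by about 0.15.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

A lower value means the model's stated confidence tracks its accuracy more closely. Under this measure, a comparison in the spirit of the abstract would compute the score for a model trained from scratch and for a pre-trained, fine-tuned model on the same test set.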
