Pretraining with random noise for uncertainty calibration
Jeonghwan Cheon, Se-Bum Paik
TL;DR
The paper addresses uncertainty calibration in deep neural networks, where models are often overconfident and therefore unreliable on unseen data. It proposes a simple, biologically inspired approach: pretrain networks with random noise and random labels to pre-calibrate their uncertainty before standard data training. The results show that this random noise pretraining reduces overconfidence, aligns predicted confidence with actual accuracy, and improves out-of-distribution detection without additional pre- or post-processing. The work suggests a universal initialization strategy with practical implications for safer, more robust AI systems and offers insights into prenatal learning processes.
Abstract
Uncertainty calibration is crucial for various machine learning applications, yet it remains challenging. Many models exhibit hallucinations (confident yet inaccurate responses) due to miscalibrated confidence. Here, we show that the common practice of random initialization in deep learning, often considered a standard technique, is an underlying cause of this miscalibration, leading to excessively high confidence in untrained networks. Our method, inspired by developmental neuroscience, addresses this issue by simply pretraining networks with random noise inputs and random labels, reducing overconfidence and bringing initial confidence levels closer to chance. This ensures optimal calibration, aligning confidence with accuracy during subsequent data training, without the need for additional pre- or post-processing. Pre-calibrated networks excel at identifying "unknown data," showing low confidence for out-of-distribution inputs, thereby resolving confidence miscalibration.
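
The pre-calibration step described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. The bias-free linear softmax classifier, the toy sizes, the learning rate, the step count, and every function name here (`predict`, `noise_input`, `mean_confidence`, `pretrain_on_noise`) are assumptions chosen only to keep the example small and dependency-free, whereas the paper's claims concern deep networks. The sketch draws Gaussian noise inputs, pairs each with a uniformly random label, runs plain SGD on cross-entropy, and compares the average top-class probability before and after.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Class probabilities of a bias-free linear classifier (toy stand-in)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def noise_input(rng, dim):
    """One input vector of i.i.d. standard Gaussian noise."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def mean_confidence(weights, inputs):
    """Average top-class probability over a set of inputs."""
    return sum(max(predict(weights, x)) for x in inputs) / len(inputs)


def pretrain_on_noise(weights, rng, dim, num_classes, steps, lr):
    """SGD on cross-entropy with fresh noise inputs and uniformly random labels."""
    for _ in range(steps):
        x = noise_input(rng, dim)
        label = rng.randrange(num_classes)  # label is independent of x
        probs = predict(weights, x)
        for k in range(num_classes):
            # Gradient of cross-entropy w.r.t. logit k is p_k - 1[k == label].
            err = probs[k] - (1.0 if k == label else 0.0)
            for j in range(dim):
                weights[k][j] -= lr * err * x[j]
    return weights


rng = random.Random(0)
DIM, NUM_CLASSES = 8, 4  # hypothetical toy sizes, not taken from the paper

# Random initialization; unit-variance weights give peaked outputs in this toy.
weights = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(NUM_CLASSES)]
probe = [noise_input(rng, DIM) for _ in range(500)]

conf_before = mean_confidence(weights, probe)
pretrain_on_noise(weights, rng, DIM, NUM_CLASSES, steps=3000, lr=0.02)
conf_after = mean_confidence(weights, probe)

print(f"chance level:              {1.0 / NUM_CLASSES:.3f}")
print(f"confidence at random init: {conf_before:.3f}")
print(f"confidence after noise:    {conf_after:.3f}")
```

Because the labels carry no information about the inputs, the cross-entropy objective in this toy is minimized in expectation by a uniform prediction, so the average confidence should drift toward `1 / NUM_CLASSES`. The sketch covers only the noise pretraining stage and says nothing about the subsequent training on real data or about out-of-distribution detection.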
