FFT-based Selection and Optimization of Statistics for Robust Recognition of Severely Corrupted Images
Elena Camuffo, Umberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay
TL;DR
This work tackles robust object recognition under severe image corruptions by introducing FROST, a test-time method that uses high-frequency FFT features to identify the corruption type and select normalization statistics accordingly. It constructs corruption prototypes from the first $n$ high-frequency FFT amplitudes, with $n=15$, across synthetic corruptions with intensity levels $\lambda \in \{1,2,3,4,5\}$, and maps these prototypes to corruption-generic or corruption-specific statistics via a codebook. At inference, it matches the input FFT signature to prototypes to choose $S^*$ and applies it to BN/LN layers, employing a confidence threshold $T$ to handle uncertainty. Empirically on ImageNet-C, FROST yields state-of-the-art mean corruption error reductions while preserving clean accuracy, with low memory overhead and broad applicability across architectures.
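The selection step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fft_signature`, `build_prototype`, `select_statistics`), the use of radially averaged log-amplitudes as the "high-frequency FFT amplitudes", the Euclidean nearest-prototype rule, the softmax-style confidence, and the fallback to generic statistics below the threshold $T$ are all hypothetical choices made for this example. Applying the selected statistics $S^*$ to BN/LN layers is left out.

```python
import numpy as np

def fft_signature(image, n=15):
    """Hypothetical signature: mean log-amplitude in the n highest radial
    frequency bins of the 2D FFT (an assumption, not the paper's definition)."""
    gray = image.mean(axis=-1) if image.ndim == 3 else image
    amp = np.abs(np.fft.fftshift(np.fft.fft2(gray)))
    h, w = gray.shape
    yy, xx = np.indices((h, w))
    # Integer distance of every frequency bin from the zero-frequency centre.
    radius = np.hypot(yy - h // 2, xx - w // 2).astype(int)
    sums = np.bincount(radius.ravel(), weights=np.log1p(amp).ravel())
    counts = np.bincount(radius.ravel())
    profile = sums / np.maximum(counts, 1)
    max_r = min(h, w) // 2
    return profile[max_r - n:max_r]

def build_prototype(images, n=15):
    """Average the signatures of images sharing one corruption type
    (for example, pooled over intensity levels 1 to 5)."""
    return np.mean([fft_signature(img, n) for img in images], axis=0)

def select_statistics(signature, prototypes, codebook, T=0.5,
                      fallback="generic_stats"):
    """Return the codebook entry of the nearest prototype, or the fallback
    when the (assumed) softmax confidence is below the threshold T."""
    names = list(prototypes)
    dists = np.array([np.linalg.norm(signature - prototypes[k]) for k in names])
    scores = np.exp(-(dists - dists.min()))
    conf = scores / scores.sum()
    best = int(np.argmax(conf))
    return codebook[names[best]] if conf[best] >= T else fallback

# Toy usage with synthetic data: a smooth pattern versus the same pattern
# with additive Gaussian noise standing in for one corruption type.
rng = np.random.default_rng(0)
t = np.arange(64) / 64.0
clean = np.outer(np.sin(2 * np.pi * 2 * t), np.cos(2 * np.pi * 3 * t))
noisy_set = [clean + rng.normal(0.0, 0.3, clean.shape) for _ in range(5)]
prototypes = {"clean": build_prototype([clean]),
              "gaussian_noise": build_prototype(noisy_set)}
codebook = {"clean": "clean_stats", "gaussian_noise": "noise_stats"}
query = clean + rng.normal(0.0, 0.3, clean.shape)
print(select_statistics(fft_signature(query), prototypes, codebook))
```

In this sketch the codebook values are plain string keys; in practice each entry would point to a stored set of layer-wise normalization statistics, with several corruption types possibly sharing one corruption-generic entry.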
Abstract
Improving model robustness to corrupted images is among the key challenges in enabling robust vision systems on smart devices, such as robotic agents. In particular, robust test-time performance is imperative for most applications. This paper presents a novel approach to improve the robustness of any classification model, especially on severely corrupted images. Our method (FROST) employs high-frequency features to detect the corruption type of the input image and to select layer-wise feature normalization statistics accordingly. FROST provides state-of-the-art results across different models and datasets, outperforming competitors on ImageNet-C by up to 37.1% relative gain, with an improvement over the baseline of 40.9% mCE on severe corruptions.
