Enhanced Model Robustness to Input Corruptions by Per-corruption Adaptation of Normalization Statistics

Elena Camuffo, Umberto Michieli, Simone Milani, Jijoong Moon, Mete Ozay

TL;DR

PAN addresses the challenge of robustness to diverse input corruptions in robotic vision by introducing per-corruption normalization statistics. It combines a Corruption Identification Module (CIM), per-corruption batch normalization (BN) adaptation with a codebook mapping to corruption-specific BN parameters, and test-time adaptation (TTA) to tailor statistics to the detected corruption. Empirical results across synthetic benchmarks and real-world robotic datasets show PAN achieves substantial accuracy gains with minimal overhead, outperforming several data-augmentation and general TTA baselines. The approach is lightweight, model-agnostic, and readily deployable on edge robotic systems for improved reliability in adverse conditions.

Abstract

Developing a reliable vision system is a fundamental challenge for robotic technologies (e.g., indoor service robots and outdoor autonomous robots): such a system must ensure reliable navigation even in challenging environments, such as adverse weather conditions (e.g., fog, rain), poor lighting conditions (e.g., over- or under-exposure), or sensor degradation (e.g., blurring, noise), and must guarantee high performance in safety-critical functions. Current solutions for improving model robustness usually rely on generic data augmentation techniques or on costly test-time adaptation methods. In addition, most approaches address a single vision task (typically image recognition) using synthetic data. In this paper, we introduce Per-corruption Adaptation of Normalization statistics (PAN) to enhance the robustness of vision systems. Our approach has three key components: (i) a corruption type identification module, (ii) dynamic adjustment of normalization layer statistics based on the identified corruption type, and (iii) real-time update of these statistics according to the input data. PAN can integrate seamlessly with any convolutional model for enhanced accuracy in several robot vision tasks. In our experiments, PAN obtains robust performance improvements on challenging real-world corrupted image datasets (e.g., OpenLoris, ExDark, ACDC), where most current solutions tend to fail. Moreover, PAN outperforms the baseline models by 20-30% on synthetic benchmarks in object recognition tasks.
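
Components (ii) and (iii) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `PerCorruptionBNStats`, the exponential-moving-average update, the `momentum` value, and the simplified `(N, C)` feature layout (batch by channels) are all assumptions made for illustration. It keeps one set of running statistics per corruption type for a single normalization layer, initializes every set from the clean-source statistics, and updates only the set selected by the identified corruption.

```python
import numpy as np


class PerCorruptionBNStats:
    """Hypothetical per-corruption running statistics for one normalization layer."""

    def __init__(self, source_mean, source_var, num_corruptions, momentum=0.1):
        # Every corruption-specific entry starts from the clean-source statistics.
        self.mean = np.tile(np.asarray(source_mean, dtype=np.float64), (num_corruptions, 1))
        self.var = np.tile(np.asarray(source_var, dtype=np.float64), (num_corruptions, 1))
        self.momentum = momentum  # assumed update rate, not taken from the paper

    def update(self, kappa, features):
        """Move the statistics of corruption `kappa` toward the batch statistics.

        `features` is an (N, C) array: N samples, C channels.
        """
        features = np.asarray(features, dtype=np.float64)
        batch_mean = features.mean(axis=0)
        batch_var = features.var(axis=0)
        m = self.momentum
        self.mean[kappa] = (1.0 - m) * self.mean[kappa] + m * batch_mean
        self.var[kappa] = (1.0 - m) * self.var[kappa] + m * batch_var

    def normalize(self, kappa, features, eps=1e-5):
        """Normalize features with the statistics stored for corruption `kappa`."""
        features = np.asarray(features, dtype=np.float64)
        return (features - self.mean[kappa]) / np.sqrt(self.var[kappa] + eps)
```

In this sketch, a batch identified as corruption `kappa` would first call `update(kappa, features)` and then `normalize(kappa, features)`; the entries for all other corruption types are left untouched, which is the point of keeping the statistics separate.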

Paper Structure (12 sections, 6 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Our approach enhances robot vision systems via per-corruption adaptive normalization of neural network models. This is fundamental in challenging environmental situations with corrupted images. Our proposed PAN is built on (i) a corruption identification module (CIM) that extracts per-corruption features in order to recognize the input corruption, (ii) an inexpensive test-time adaptation step to adapt model parameters to the specific corruption type, (iii) a codebook to map features to candidate parameters.
  • Figure 2: Statistics estimated at normalization layers vary depending on the image corruption type, averaged over all layers (ResNet18 on ImageNet-C). Unlike classical data augmentation approaches, where a single set of normalization statistics is estimated for all corruption types on a source domain (red), our method estimates normalization statistics for each corruption (blue), which are very close to the reference ones, estimated assuming that the true corruption type of the data is known (green).
  • Figure 3: Mean and variance distributions of the output of the first BN layer when encountering clean data, contrast-corrupted data, and shot-noise-corrupted data (ResNet18 on ImageNet-C).
  • Figure 4: The corruption identification module (Sec. \ref{sec:corrid}) is trained on corrupted training images, and a set of corruption-related prototypical features $\bar{\mathbf{z}}_{1,\dots,K}$ is built by averaging the features $\mathbf{z}$ relative to each corruption. Then, at inference time, the CIM is frozen and a Codebook $\mathfrak{C}$ (Sec. \ref{sec:codebook}) maps the corruption identified by the CIM to the respective corruption-specific BN parameters. Such parameters are initialized with the ones of the pre-trained downstream task model $F(\cdot)$ on clean source images $\mathcal{X}^{\mathcal{S}}$ and adapted to test images via TTA, separately for each identified corruption $\hat{\kappa}$ (Sec. \ref{sec:specialization}), obtaining a corruption-specific set $\Lambda_{\hat{\kappa}}^{\mathcal{T}}$. Finally, $\Lambda_{\hat{\kappa}}^{\mathcal{T}}$ is plugged into $F(\cdot)$, achieving enhanced robustness on downstream tasks, specifically on the identified corruption. The system stores $\mathbf{z}_{1,\dots,K}$ and $\Lambda^{\mathcal{T}}_{1,\dots,K}$ to use and update them while doing inference.
  • Figure 5: Per-corruption qualitative results of semantic segmentation using DeepLabV2 [deeplab2_2021] on the ACDC [SDV21] dataset.
  • ...and 1 more figure
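
The prototype-building step described in the Figure 4 caption can be sketched as follows. The caption states that prototypes are obtained by averaging features per corruption; how a test feature is matched to a prototype is not stated here, so the cosine-similarity nearest-prototype rule below is an assumption, as are the function names `build_prototypes` and `identify_corruption`. The sketch also assumes every corruption label has at least one training feature.

```python
import numpy as np


def build_prototypes(features, labels, num_corruptions):
    """Average the feature vectors of each corruption label into a (K, D) array."""
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    # Assumes each label 0..K-1 occurs at least once in `labels`.
    return np.stack([features[labels == k].mean(axis=0) for k in range(num_corruptions)])


def identify_corruption(z, prototypes):
    """Return the index of the prototype most similar to feature `z`.

    Cosine similarity is an assumed matching rule, used only for illustration.
    """
    z = np.asarray(z, dtype=np.float64)
    norms = np.linalg.norm(prototypes, axis=1) * np.linalg.norm(z) + 1e-12
    sims = (prototypes @ z) / norms
    return int(np.argmax(sims))
```

The returned index plays the role of $\hat{\kappa}$: in a full pipeline it would be used as the key into the codebook to retrieve the corruption-specific BN parameters, for example a plain dictionary keyed by corruption index.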