Control+Shift: Generating Controllable Distribution Shifts

Roy Friedman, Rhea Chowers

TL;DR

This work proposes a new method for generating realistic datasets with distribution shifts using any decoder-based generative model. It finds that enlarging the training dataset beyond a certain point has no effect on robustness, and that stronger inductive biases increase robustness.

Abstract

We propose a new method for generating realistic datasets with distribution shifts using any decoder-based generative model. Our approach systematically creates datasets with varying intensities of distribution shift, facilitating a comprehensive analysis of model performance degradation. We then use these generated datasets to evaluate the performance of various commonly used networks and observe a consistent decline in performance with increasing shift intensity, even when the effect is almost imperceptible to the human eye. We see this degradation even when using data augmentations. We also find that enlarging the training dataset beyond a certain point has no effect on robustness, and that stronger inductive biases increase robustness.

Paper Structure (20 sections, 4 equations, 9 figures)

Figures (9)

  • Figure 1: On the left, the effect of a specific distribution shift is displayed as the accuracy on the shifted distribution versus the accuracy on the training distribution. For this plot, the training and shifted distributions are from our "extend shift" dataset described in \ref{sec:generated-data}. The center and right plots show a schematic for distribution shift in the real world, where the amount of shift can be either modest in some sense (center) or severe (right). In both cases, we expect the training distribution to be a subset of the full distribution in question, and the shifted distribution to be a subset distinct from the training distribution.
  • Figure 2: A schematic for the generation processes we explore (top) and images generated according to this procedure (bottom). In all types of shifts, the severity of the shift is controllable, while still resulting in CIFAR10-like images. The bottom half of the figure shows images generated with the same seed at different shift intensities, with the left-most column in each depicting the training set.
  • Figure 3: The accuracy of different models trained on the "overlap" (left), "extend" (center) and "truncation" (right) datasets and tested on the appropriate shifted distribution. The top row shows results for the CIFAR10 variant of the distribution shifts and the bottom for the ImageNet generated shifts. In all cases, the trained models are not robust to the generated distribution shifts we present. The controlling parameters in this figure are the same as those shown in \ref{fig:shift-types}, and the drop in accuracy is on the corresponding images.
  • Figure 4: The accuracy as a function of shift on the "extend" dataset for both the CIFAR10 (top) and ImageNet (bottom) variants, without and with augmentation. Additionally, we show plots of the robustness slope as a function of the test accuracy of the models. In all cases, augmentation drastically helps with robustness.
  • Figure 5: Left: $\Delta$-accuracy as a function of overlap angle in the "overlap" dataset for different sizes and supports of the training distribution. Center: $\Delta$-accuracy as a function of overlap angle in the "overlap" dataset for models with different inductive biases. Right: the robustness slope as a function of test accuracy for all the models evaluated in the left and center plots.
  • ...and 4 more figures
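
The captions for Figures 4 and 5 mention a $\Delta$-accuracy and a robustness slope, but this summary does not define either. As a rough sketch only, the snippet below assumes that $\Delta$-accuracy is the accuracy on the shifted data minus the accuracy on the unshifted data, and that the robustness slope is the slope of a least-squares line fitted to accuracy against shift intensity. Both definitions, the function names, and the numbers are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def delta_accuracy(shifted_acc, unshifted_acc):
    """Assumed definition: accuracy on shifted data minus accuracy on unshifted data."""
    return np.asarray(shifted_acc, dtype=float) - float(unshifted_acc)


def robustness_slope(shift_intensity, accuracy):
    """Assumed definition: slope of a least-squares line of accuracy vs. shift intensity."""
    x = np.asarray(shift_intensity, dtype=float)
    y = np.asarray(accuracy, dtype=float)
    # np.polyfit returns coefficients from highest degree to lowest: [slope, intercept].
    slope, _intercept = np.polyfit(x, y, deg=1)
    return float(slope)


# Made-up numbers for illustration only; these are not results from the paper.
intensities = [0.0, 1.0, 2.0, 3.0]
accuracies = [0.90, 0.85, 0.80, 0.75]

deltas = delta_accuracy(accuracies, accuracies[0])
slope = robustness_slope(intensities, accuracies)
```

Under these assumptions, a slope closer to zero would indicate a model whose accuracy degrades more slowly as the shift intensity grows.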

Theorems & Definitions (1)

  • Definition: distribution shift and shift intensity