UNet-AF: An alias-free UNet for image restoration

Jérémy Scanvic, Quentin Barthélemy, Julián Tachella

Abstract

The simplicity and effectiveness of the UNet architecture make it ubiquitous in image restoration, image segmentation, and diffusion models. UNets are often assumed to be equivariant to translations, yet they traditionally consist of layers that are known to be prone to aliasing, which hinders their equivariance in practice. To overcome this limitation, we propose a new alias-free UNet designed from a careful selection of state-of-the-art translation-equivariant layers. We evaluate the proposed equivariant architecture against non-equivariant baselines on image restoration tasks and observe competitive performance with a significant increase in measured equivariance. Through extensive ablation studies, we also demonstrate that each change is crucial for the empirical equivariance of the architecture. Our implementation is available at https://github.com/jscanvic/UNet-AF

Paper Structure (13 sections, 3 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Equivariance and robustness to circular translations. (Top) Performance on a horizontally translated image, in 0.01 px steps. (Bottom) Adversarial performance, i.e., performance on the worst translation up to a maximum displacement, in 0.25 px steps. UNet-AF is stable and robust, unlike the baselines.
  • Figure 2: Overview of the architecture. UNet-AF is obtained by modifying the UNet architectures of Ronneberger et al. (2015) and Jin et al. (2017): anti-aliasing filters are added in upsampling layers, max pooling layers are replaced with blur pooling layers, ReLU activations are replaced with filtered GELU layers, and batch normalization is replaced with alias-free layer normalization. The padding mode in convolutions is left unconstrained and the residual connection is optional.
  • Figure 3: Reconstructions of a circularly blurred image. The PSNR is displayed in the bottom-left corner of each image.
  • Figure 4: Evaluation PSNR throughout training for circular deblurring. Our proposed model has significantly smoother training dynamics than the non-equivariant baselines.
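
The notion of equivariance to circular translations used in the captions above can be illustrated with a small NumPy sketch. It is not the authors' evaluation code: the helper names (`circular_blur`, `maxpool_upsample`, `equivariance_error`), the 1D setting, the integer shifts, and the relative-error metric are all assumptions made for illustration, whereas the paper reports PSNR under sub-pixel shifts. The sketch compares a circular convolution, which commutes with circular shifts, against stride-2 max pooling followed by nearest-neighbour upsampling, which does not commute with odd shifts.

```python
import numpy as np

def circular_blur(x, kernel):
    """Circular (periodic) 1D convolution computed in the Fourier domain."""
    k = np.zeros_like(x)
    k[: len(kernel)] = kernel
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)))

def maxpool_upsample(x):
    """Stride-2 max pooling followed by nearest-neighbour upsampling."""
    pooled = x.reshape(-1, 2).max(axis=1)
    return np.repeat(pooled, 2)

def equivariance_error(f, x, shift):
    """Relative error between f(shift(x)) and shift(f(x)) for a circular shift."""
    shifted_then_mapped = f(np.roll(x, shift))
    mapped_then_shifted = np.roll(f(x), shift)
    return (np.linalg.norm(shifted_then_mapped - mapped_then_shifted)
            / np.linalg.norm(mapped_then_shifted))

rng = np.random.default_rng(0)
x = rng.standard_normal(64)          # hypothetical test signal
kernel = np.array([0.25, 0.5, 0.25])  # hypothetical low-pass filter

# Circular convolution: error at the level of floating-point round-off.
err_conv = equivariance_error(lambda v: circular_blur(v, kernel), x, shift=1)

# Max pooling: equivariant to shifts by the stride, but not to a 1-sample shift.
err_pool_odd = equivariance_error(maxpool_upsample, x, shift=1)
err_pool_even = equivariance_error(maxpool_upsample, x, shift=2)

print(f"circular conv, shift 1: {err_conv:.2e}")
print(f"max pooling,   shift 1: {err_pool_odd:.2e}")
print(f"max pooling,   shift 2: {err_pool_even:.2e}")
```

In this toy setting the pooling layer is only equivariant to shifts that are multiples of its stride, which is the kind of aliasing-induced failure that the layer replacements listed in Figure 2 are meant to address.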