Domain Adaptation for Camera-Specific Image Characteristics using Shallow Discriminators

Maximiliane Gruber, Jürgen Seiler, André Kaup

TL;DR

It is shown that a smaller receptive field size improves learning of unknown image distortions by more accurately reproducing local distortion characteristics at a low network complexity.

Abstract

Each image acquisition setup leads to its own camera-specific image characteristics that degrade image quality. In learning-based perception algorithms, characteristics occurring during the application phase, but absent in the training data, lead to a domain gap that impedes performance. Previously, pixel-level domain adaptation through unpaired learning of the pristine-to-distorted mapping function has been proposed. In this work, we propose shallow discriminator architectures to address limitations of these approaches. We show that a smaller receptive field size improves learning of unknown image distortions by more accurately reproducing local distortion characteristics at a low network complexity. In a domain adaptation setup for instance segmentation, we achieve mean average precision increases over previous methods of up to 0.15 for individual distortions and up to 0.16 for camera-specific image characteristics in a simplified camera model. In terms of the number of parameters, our approach matches the complexity of one state-of-the-art method while reducing complexity by a factor of 20 compared to another, demonstrating superior efficiency without compromising performance.

Paper Structure

This paper contains 10 sections, 2 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Camera-specific image characteristics result from the unknown, unique combination of individual image distortions (blue boxes) occurring along the imaging chain (yellow boxes).
  • Figure 2: Unpaired learning of unknown image distortions from pristine source domain $X$ to distorted target domain $Y$. The pristine-to-distorted mapping function $\mathcal{C}$ (unattainable in practice) is approximated by the forward generator $G$.
  • Figure 3: Network architecture of discriminators. Conv $c/k/s \downarrow$ denotes a convolutional layer with $c$ output channels, a kernel size of $k$, and a stride of $s$. All convolutional layers marked in purple are followed by BatchNorm and Leaky ReLU. All layers marked in green are followed by Leaky ReLU.
  • Figure 4: Training procedure for pixel-level domain adaptation. The true pristine-to-distorted mapping function $\mathcal{C}$ denotes the mapping from pristine source domain $X$ to distorted target domain $Y$. $\tilde{\mathcal{C}}$ denotes the learned mapping.
  • Figure 5: Results of instance segmentation measured as mAP over distortion level for various distortion types.
  • ...and 2 more figures
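
The link between discriminator depth and receptive field size, which the abstract and the Conv $c/k/s$ notation of Figure 3 rely on, can be illustrated with a short calculation. The sketch below is not taken from the paper: the function name `receptive_field` and both layer lists are assumptions chosen for illustration. The first list is the widely used 70×70 PatchGAN layout; the second is a hypothetical two-layer stack, not one of the architectures shown in Figure 3.

```python
from typing import List, Tuple


def receptive_field(layers: List[Tuple[int, int]]) -> int:
    """Receptive field (in input pixels, along one axis) of a stack of
    convolutional layers, each given as (kernel_size, stride).

    Dilation is assumed to be 1 and padding does not affect the result.
    """
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between adjacent outputs so far
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Illustrative configurations (assumptions, not the paper's architectures).
patchgan_70 = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
hypothetical_shallow = [(4, 2), (4, 1)]

print(receptive_field(patchgan_70))           # 70
print(receptive_field(hypothetical_shallow))  # 10
```

Dropping strided layers shrinks the receptive field quickly, because each layer's contribution is scaled by the product of all preceding strides. A shallower discriminator therefore judges much smaller patches, which is consistent with the claim that local distortion characteristics are what it needs to capture.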