Domain Similarity-Perceived Label Assignment for Domain Generalized Underwater Object Detection

Xisheng Li, Wei Li, Pinhao Song, Mingjun Zhang, Jie Zhou

TL;DR

Through domain-specific data augmentation techniques, this paper achieves state-of-the-art results on the underwater cross-domain object detection benchmark S-UODAC2020 and validates the method's effectiveness on the Cityscapes dataset.

Abstract

The inherent characteristics of water bodies and fluctuations in light give rise to large differences between layers and regions in underwater environments. When the test set is collected in a different marine area from the training set, domain shift emerges and significantly compromises the model's ability to generalize. The Domain Adversarial Learning (DAL) training strategy has previously been used to tackle this challenge. However, DAL depends heavily on manually assigned one-hot domain labels, which imply no difference among samples in the same domain. This assumption makes DAL unstable. This paper introduces Domain Similarity-Perceived Label Assignment (DSP), in which the domain label of each image is regarded as its similarity to the specified domains. Through domain-specific data augmentation techniques, we achieve state-of-the-art results on the underwater cross-domain object detection benchmark S-UODAC2020. Furthermore, we validate the effectiveness of our method on the Cityscapes dataset.
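
To make the contrast with one-hot labels concrete, the following minimal sketch turns an image's distances to a few base domains into a similarity distribution. The softmax over negative Euclidean distances, the `temperature` parameter, the function name `soft_domain_label`, and the toy 2-D features are all assumptions made for illustration; the abstract states only that the domain label is the image's similarity to the specified domains and does not give this formula.

```python
import math


def soft_domain_label(feature, base_domains, temperature=1.0):
    """Return a similarity distribution over base domains (sums to 1).

    Hypothetical helper: similarity is modeled as a softmax over
    negative Euclidean distances, which is an assumption.
    """
    # Distance from the image feature to each base-domain feature.
    dists = [math.dist(feature, b) for b in base_domains]
    # Closer domains get larger logits.
    logits = [-d / temperature for d in dists]
    # Subtract the maximum logit for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy features for three base domains and one image (illustrative values).
base_domains = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
image_feature = [1.0, 0.5]

label = soft_domain_label(image_feature, base_domains)

# The one-hot label keeps only the nearest domain and discards the rest.
nearest = label.index(max(label))
one_hot = [1.0 if i == nearest else 0.0 for i in range(len(label))]
```

In this toy case the one-hot label treats the image as belonging entirely to the first base domain, while the soft label also records how close it is to the other two, which is the within-domain variation that one-hot labels cannot express.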

Paper Structure (24 sections, 4 equations, 7 figures, 6 tables)

This paper contains 24 sections, 4 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: DSP is a preprocessing module applied before model training. It uses a model pre-trained on ImageNet to extract low-level semantic information from images and aggregates their feature statistics. These statistics are stacked together, and base domains are selected from them using the Farthest Feature Sampling method. The number of images in each base domain is then augmented using Adaptive Instance Normalization (AdaIN), and the augmented images are fed into the Domain Classifier. Finally, the input images are passed through the Domain Classifier for inference, yielding domain labels.
  • Figure 2: SCG* generates variations of each image in the dataset by altering only the low-frequency information, preserving the core content while introducing different styles.
  • Figure 3: S-UODAC2020 is a dataset for underwater cross-domain object detection, comprising four marine species: echinus, holothurian, scallop, and starfish. The training set consists of 4,745 images sourced from six distinct domains, while the test set comprises 797 images from domains distinct from those in the training set.
  • Figure 4: Visualizing different methods using t-SNE. Red points denote data from the source domain, while blue points represent data from the target domain. These features are extracted from the final stage of ResNet using various methods.
  • Figure 5: Comparison of detection results produced by different methods. Box color indicates the detected class: red for echinus, blue for scallop, yellow for starfish, and green for holothurian.
  • ...and 2 more figures
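
The preprocessing steps in the Figure 1 caption (feature statistics, farthest sampling, AdaIN) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the paper's implementation: feature maps are plain lists of channels rather than activations from an ImageNet-pre-trained network, "feature statistics" is taken to mean per-channel mean and standard deviation, "Farthest Feature Sampling" is assumed to be greedy farthest-point sampling over those statistic vectors, and training the Domain Classifier is omitted. All function names are hypothetical. The AdaIN step uses the standard formula, which rescales each content channel to the style channel's mean and standard deviation.

```python
import math
from statistics import fmean, pstdev


def channel_stats(feat):
    """Per-channel (mean, std) of a feature map given as a list of channels."""
    return [(fmean(ch), pstdev(ch)) for ch in feat]


def stats_vector(feat):
    """Flatten the per-channel statistics into one vector."""
    return [v for pair in channel_stats(feat) for v in pair]


def farthest_sampling(vectors, k, start=0):
    """Greedily pick k indices, each one farthest from those already chosen."""
    chosen = [start]
    # Distance from every vector to its nearest chosen vector so far.
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(chosen) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        chosen.append(nxt)
        min_dist = [min(d, math.dist(v, vectors[nxt]))
                    for d, v in zip(min_dist, vectors)]
    return chosen


def adain(content, style, eps=1e-5):
    """Rescale each content channel to the style channel's mean and std."""
    out = []
    for ch, (c_mu, c_sd), (s_mu, s_sd) in zip(
            content, channel_stats(content), channel_stats(style)):
        out.append([s_sd * (x - c_mu) / (c_sd + eps) + s_mu for x in ch])
    return out


# Four toy "feature maps", each with two channels of four values.
feats = [
    [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 2.0, 2.0]],
    [[0.1, 1.1, 2.1, 3.1], [1.0, 1.0, 2.0, 2.0]],
    [[10.0, 12.0, 14.0, 16.0], [5.0, 5.0, 7.0, 7.0]],
    [[0.0, 0.0, 0.0, 4.0], [3.0, 3.0, 3.0, 3.0]],
]

vectors = [stats_vector(f) for f in feats]

# Select three mutually distant statistic vectors as base domains.
base_idx = farthest_sampling(vectors, k=3)

# Augment: restyle the first feature map toward the second base domain.
stylized = adain(feats[0], feats[base_idx[1]])
```

After the AdaIN step, the channel statistics of `stylized` approximately match those of the chosen base domain while the relative pattern within each channel of the original is kept, which is why the operation can enlarge a base domain without changing image content.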