Diffusion-based Data Augmentation and Knowledge Distillation with Generated Soft Labels Solving Data Scarcity Problems of SAR Oil Spill Segmentation

Jaeho Moon, Jeonghwan Yun, Jaehyun Kim, Jaehyup Lee, Munchurl Kim

TL;DR

This work tackles the data scarcity problem in SAR oil spill segmentation by introducing DAKTer, a diffusion-based data augmentation and knowledge transfer framework that jointly generates SAR images and per-pixel soft labels. A key contribution is the SNR-based balancing factor $b$, which stabilizes joint generation of image and soft-label modalities during diffusion, enabling effective knowledge transfer to a student segmentation model via a soft-label KD loss. Empirical results on the OSD and SOS datasets show that DAKTer outperforms existing diffusion-based DA methods and KD baselines, with notable gains in mIoU and F1 across multiple segmentation backbones, driven by the richer supervision provided by soft labels. The approach enhances robustness and generalization for SAR oil spill monitoring and holds practical potential for integration into real-world marine surveillance systems.

Abstract

Oil spills pose severe environmental risks, making early detection crucial for effective response and mitigation. As Synthetic Aperture Radar (SAR) imaging operates under all-weather conditions, SAR-based oil spill segmentation enables fast and robust monitoring. However, deep learning models for SAR oil spill segmentation are often hard to train because labeled data are scarce. To address this limitation, we propose a diffusion-based data augmentation with knowledge transfer (DAKTer) strategy. Our DAKTer strategy enables a diffusion model to generate SAR oil spill images paired with soft labels, which offer richer class probability distributions than segmentation masks (i.e., hard labels). In addition, for reliable joint generation of high-quality SAR images and well-aligned soft labels, we introduce an SNR-based balancing factor that aligns the noise corruption process of both modalities in diffusion models. By leveraging the generated SAR images and soft labels, a student segmentation model can learn robust feature representations without teacher models trained for the same task, improving its ability to segment oil spill regions. Extensive experiments demonstrate that our DAKTer strategy effectively transfers the knowledge of per-pixel class probabilities to the student segmentation model, helping it distinguish oil spill regions from look-alike regions in SAR images. Our DAKTer strategy boosts various segmentation models to achieve superior performance by large margins compared to other generative data augmentation methods.
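To make the idea of soft-label supervision concrete, the following is a minimal sketch of a per-pixel loss between a student's predicted class distribution and a generated soft label. The abstract does not give the exact form of the loss, so the KL-divergence formulation, the function names `softmax` and `soft_label_loss`, and the three-class example values are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits, soft_labels, eps=1e-12):
    """Mean per-pixel KL(soft_label || student prediction).

    Assumption: KL divergence is used here only as one common choice for
    soft-label supervision; the paper's exact loss may differ.

    student_logits: per-pixel logits, shape [num_pixels][num_classes]
    soft_labels:    per-pixel class probabilities, each summing to 1
    """
    total = 0.0
    for logits, target in zip(student_logits, soft_labels):
        pred = softmax(logits)
        total += sum(
            t * (math.log(t + eps) - math.log(p + eps))
            for t, p in zip(target, pred)
            if t > 0.0  # zero-probability classes contribute nothing
        )
    return total / len(student_logits)


# Hypothetical two-pixel example with three classes
# (oil spill, look-alike, land); the values are made up.
soft = [[0.7, 0.25, 0.05], [0.1, 0.2, 0.7]]
matching_logits = [[math.log(p) for p in pixel] for pixel in soft]
uniform_logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

loss_match = soft_label_loss(matching_logits, soft)
loss_uniform = soft_label_loss(uniform_logits, soft)
```

Unlike a hard mask, the soft label in the first pixel still tells the student that the look-alike class is a plausible confusion, which is the kind of extra signal the abstract attributes to soft labels.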

Paper Structure

This paper contains 31 sections, 16 equations, 12 figures, 4 tables, 2 algorithms.

Figures (12)

  • Figure 1: Overview of our Data Augmentation and Knowledge Transfer (DAKTer) strategy. A diffusion-based data augmentation model synthesizes an augmented dataset $\mathcal{D}^a$ consisting of image-soft label pairs. Additionally leveraging $\mathcal{D}^a$, a student segmentation model can be trained with more informative soft labels, compared to segmentation masks (hard labels).
  • Figure 2: Visualization of generated outputs from DDPM [ddpm] trained with our DAKTer strategy. (a) Generated SAR image with an oil spill region (the dark region on the left). (b) Generated soft label containing probability (prob.) maps for each class (scaled for better visualization). (c) Prob. values are high in the oil spill region. (d) As the 'look-alike' class appears dark in SAR images, similar to the 'oil-spill' class, its prob. values are relatively higher (yellow regions) in the oil spill region than in other regions. (e) As the 'Land' class appears bright in SAR images (distinct from the 'oil-spill' class), its prob. values are low across the whole region.
  • Figure 3: Effect of our SNR-based balancing factor $b$ on the noise corruption process in DDPM [ddpm]. Without $b$, segmentation masks always retain more structural information than SAR images at the same timestep. With $b$ applied, noise corruption is balanced across the two modalities, preserving comparable information during joint generation.
  • Figure 4: Data generation results: Qualitative comparison of SAR images and corresponding segmentation masks generated by (a) SemGAN [semgan], (b) DDPM [ddpm], (c) SatSynth [satsynth], and (d) our DAKTer, compared to (e) samples from the original OSD dataset [related_oilspill_compare_all].
  • Figure 5: Comparison of segmentation performance on the OSD dataset [related_oilspill_compare_all] across different augmentation scales ($0\%$, $50\%$, $100\%$, $150\%$, and $200\%$ of the original training set) for SegFormer [experiment_segformer] trained with SatSynth [satsynth], our DAKTer without KD (i.e., without soft-label supervision), and our full DAKTer strategy.
  • ...and 7 more figures
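
The Figure 3 caption describes two modalities losing information at different rates under the same noise schedule. The sketch below illustrates one way that imbalance can arise and how a scalar rescaling could remove it. It assumes the standard DDPM forward process $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$ with a linear beta schedule. The function `balancing_scale`, the signal-power values, and the square-root rule are hypothetical and are not the paper's definition of $b$, which is not given in this section.

```python
import math


def linear_alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_i) for the first t steps (t >= 1)."""
    alpha_bar = 1.0
    for i in range(t):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        alpha_bar *= 1.0 - beta
    return alpha_bar


def snr(signal_power, alpha_bar, scale=1.0):
    """SNR of x_t = sqrt(alpha_bar) * scale * x_0 + sqrt(1 - alpha_bar) * eps,
    where x_0 has the given mean signal power and eps has unit variance."""
    return (scale ** 2) * signal_power * alpha_bar / (1.0 - alpha_bar)


def balancing_scale(image_power, label_power):
    """Hypothetical scale applied to the label so both modalities share one
    SNR at every timestep. An assumption for illustration only."""
    return math.sqrt(image_power / label_power)


# Made-up signal powers: the label carries more power than the image.
image_power, label_power = 0.2, 0.8
a_bar = linear_alpha_bar(500)

snr_image = snr(image_power, a_bar)
snr_label_raw = snr(label_power, a_bar)            # higher than the image
scale = balancing_scale(image_power, label_power)  # 0.5 for these values
snr_label_balanced = snr(label_power, a_bar, scale=scale)
```

Because the rescaling multiplies the SNR by the same factor at every timestep, matching the two modalities once matches them across the whole schedule in this simplified setting.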