
Whale Detection Enhancement through Synthetic Satellite Images

Akshaj Gaur, Cheng Liu, Xiaomin Lin, Nare Karapetyan, Yiannis Aloimonos

TL;DR

This work tackles the data scarcity barrier in whale detection within aerial and satellite imagery by introducing SeaDroneSim2, a Blender-based simulator that generates photorealistic maritime scenes with ground-truth segmentation masks. It offers a configurable pipeline for rendering scenes with varying water properties, lighting, turbidity, and camera parameters, producing datasets that can augment limited real data. Empirical results show that combining synthetic data with a small amount of real data significantly improves detection/segmentation performance over using real data alone (relative gains of up to around 15% in the reported metrics), while also highlighting some limits of sim-to-real transfer when real data is abundant. The authors open-source both the SeaDroneSim2 tool and the generated dataset, enabling broader adoption and extension to additional marine objects and to challenging conditions such as ice cover.
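As a rough illustration of what a configurable rendering pipeline of this kind involves, the sketch below samples randomized scene settings for the properties named above (turbidity, lighting, water color, camera parameters). It is not SeaDroneSim2's actual interface: the `SceneConfig` fields, the value ranges, and the function names are all hypothetical, and the step that would hand each configuration to Blender is left out.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneConfig:
    """One randomized scene description (all fields and scales are hypothetical)."""
    turbidity: float          # 0 = clear water, 1 = very murky
    sun_strength: float       # relative light intensity
    water_rgb: tuple          # base water color, each channel in [0, 1]
    camera_altitude_m: float  # camera height above the surface, in meters
    whale_yaw_deg: float      # rotation of the whale about the vertical axis
    sensor_noise_std: float   # standard deviation of additive image noise


def sample_scene(rng: random.Random) -> SceneConfig:
    """Draw one scene configuration from assumed uniform ranges."""
    return SceneConfig(
        turbidity=rng.uniform(0.0, 1.0),
        sun_strength=rng.uniform(0.5, 5.0),
        water_rgb=(
            rng.uniform(0.0, 0.2),
            rng.uniform(0.2, 0.5),
            rng.uniform(0.3, 0.7),
        ),
        camera_altitude_m=rng.uniform(50.0, 500.0),
        whale_yaw_deg=rng.uniform(0.0, 360.0),
        sensor_noise_std=rng.uniform(0.0, 0.05),
    )


def sample_scenes(n: int, seed: int = 0) -> list:
    """Return n scene configurations; the same seed gives the same scenes."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]


if __name__ == "__main__":
    for scene in sample_scenes(3, seed=42):
        print(scene)
```

Seeding the generator keeps a generated dataset reproducible: the same seed yields the same list of scene descriptions, so a render can be repeated or extended later.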

Abstract

With a number of marine populations in rapid decline, collecting and analyzing data about them has become increasingly important for developing effective conservation policies for a wide range of marine animals, including whales. Modern computer vision algorithms allow us to detect whales in images across a wide range of domains, further speeding up and enhancing the monitoring process. However, these algorithms rely heavily on large training datasets, which are challenging and time-consuming to collect, particularly in marine or aquatic environments. Recent advances in AI, however, have made it possible to create datasets synthetically for training machine learning algorithms, enabling solutions that were not possible before. In this work, we present the SeaDroneSim2 benchmark suite, which addresses this challenge by generating synthetic aerial and satellite image datasets to improve the detection of whales and reduce the effort required for training data collection. We show that by augmenting 10% of the real data with synthetic data, we can achieve a 15% performance boost on whale detection compared to training on the real data alone. We open-source both the code of the simulation platform SeaDroneSim2 and the dataset generated through it.

Paper Structure (15 sections, 1 equation, 4 figures, 1 table)


Figures (4)

  • Figure 1: The first row presents real whale images taken from space, while the second row shows simulated whale images generated with SeaDroneSim2.
  • Figure 2: An overview of our approach. (a) Assets such as water properties, objects, and materials are loaded into SeaDroneSim2 to generate synthetic datasets. Note that the synthetic dataset includes a ground-truth mask for the object of interest. (b) We modify properties within the scene, such as the noise level, the rotation of the object, and the altitude of the camera. (c) The generated synthetic dataset is then fed into a neural network to obtain the object detection result. We demonstrate the generation of aerial, satellite, and underwater images for two different objects of interest in our study. Note: object detection images are cropped and enlarged for better visualization.
  • Figure 3: Synthetic images generated with SeaDroneSim2. In the first row, the first two images showcase increasing water turbidity, and the last two images depict varying lighting conditions. In the second row, the first two images display different water colors, while the last two images exhibit increasing satellite image noise. In the third row, the first two images demonstrate varying altitudes, while the last two images illustrate different whale positions, including logging, spyhopping, and submerging. In the last row, the first two images show different water waves, while the last two images illustrate different rocks and hills.
  • Figure 4: From left to right: sample real input image, ground truth, U-Net segmentation result without synthetic augmentation of the real data, U-Net segmentation result with synthetic augmentation, FPN segmentation result without synthetic augmentation, and FPN segmentation result with synthetic augmentation. All networks here are trained with only 10% of the real data.
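The training setup behind Figure 4 (only 10% of the real data, with or without synthetic augmentation) can be sketched as a simple data-mixing step. This is a minimal sketch under stated assumptions, not the authors' pipeline: the summary does not say how real and synthetic samples were combined, so the function name `build_training_set`, the random subsampling, and the plain concatenation are all illustrative choices.

```python
import random


def build_training_set(real_samples, synthetic_samples, real_fraction=0.10, seed=0):
    """Mix a random subset of the real samples with all synthetic samples.

    Hypothetical helper: returns (sample, source) pairs, where source is
    "real" or "synthetic", shuffled into a single training list.
    """
    if not 0.0 < real_fraction <= 1.0:
        raise ValueError("real_fraction must be in (0, 1]")

    rng = random.Random(seed)
    real_samples = list(real_samples)

    # Keep at least one real sample when any exist, and never more than available.
    n_real = min(len(real_samples), max(1, round(len(real_samples) * real_fraction)))
    real_subset = rng.sample(real_samples, n_real)

    mixed = [(s, "real") for s in real_subset]
    mixed += [(s, "synthetic") for s in synthetic_samples]
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    real = [f"real_{i:03d}.png" for i in range(100)]        # placeholder file names
    synthetic = [f"synth_{i:03d}.png" for i in range(50)]   # placeholder file names

    baseline = build_training_set(real, [], real_fraction=0.10)
    augmented = build_training_set(real, synthetic, real_fraction=0.10)
    print(len(baseline), len(augmented))
```

Calling the helper with an empty synthetic list gives the real-only baseline, and using the same seed for both calls selects the same real subset, so the two training sets differ only in the added synthetic samples.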