CloudTracks: A Dataset for Localizing Ship Tracks in Satellite Images of Clouds
Muhammad Ahmed Chaudhry, Lyna Kim, Jeremy Irvin, Yuzu Ido, Sonia Chu, Jared Thomas Isobe, Andrew Y. Ng, Duncan Watson-Parris
TL;DR
The paper tackles the challenge of localizing ship tracks in satellite imagery to study anthropogenic aerosol effects on clouds. It introduces CloudTracks, a dataset of 3,560 MODIS images with over 12,000 ship-track instances, along with semantic and instance segmentation masks derived from careful labeling and preprocessing. Benchmarking shows state-of-the-art localization performance (IoU up to 61.29 with semantic segmentation) and improved instance counting (MAE as low as 1.64) compared to prior work, while also highlighting the remaining difficulties with thin, overlapping tracks. The work aims to spur new ML approaches for elongated, occluded features in geospatial imagery and to support climate research on aerosol–cloud interactions, with the dataset openly released to the community.
Abstract
Clouds play a significant role in global temperature regulation through their effect on planetary albedo. Anthropogenic emissions of aerosols can alter the albedo of clouds, but the extent of this effect, and its consequent impact on temperature change, remains uncertain. Human-induced clouds caused by ship aerosol emissions, commonly referred to as ship tracks, provide visible manifestations of this effect distinct from adjacent cloud regions and therefore serve as a useful sandbox to study human-induced clouds. However, the lack of large-scale ship track data makes it difficult to deduce their general effects on cloud formation. Towards developing automated approaches to localize ship tracks at scale, we present CloudTracks, a dataset containing 3,560 satellite images labeled with more than 12,000 ship track instance annotations. We train semantic segmentation and instance segmentation model baselines on our dataset and find that our best model substantially outperforms the previous state of the art for ship track localization (61.29 vs. 48.65 IoU). We also find that the best instance segmentation model is able to identify the number of ship tracks in each image more accurately than the previous state of the art (1.64 vs. 4.99 MAE). However, we identify cases where the best model struggles to accurately localize and count ship tracks, so we believe CloudTracks will stimulate novel machine learning approaches to better detect elongated and overlapping features in satellite images. We release our dataset openly at zenodo.org/records/10042922.
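For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how intersection over union (IoU) for a binary mask and mean absolute error (MAE) for per-image track counts are commonly computed. It is not the authors' evaluation code: the function names `iou` and `count_mae`, the list-of-rows mask layout, and the convention that two empty masks score 1.0 are all assumptions made for illustration. The sketch returns IoU as a fraction in [0, 1]; the values quoted in the abstract appear to be on a 0 to 100 scale, which is also an assumption.

```python
def iou(pred, target):
    """Intersection over union of two same-shaped binary masks.

    Masks are given as lists of rows containing 0/1 values
    (a hypothetical layout chosen to keep the sketch dependency-free).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


def count_mae(pred_counts, true_counts):
    """Mean absolute error between predicted and true per-image track counts."""
    if len(pred_counts) != len(true_counts) or not true_counts:
        raise ValueError("count lists must be non-empty and of equal length")
    total = sum(abs(p - t) for p, t in zip(pred_counts, true_counts))
    return total / len(true_counts)


# Toy example: a 2x4 predicted mask against a 2x4 ground-truth mask.
pred_mask = [[0, 1, 1, 0],
             [0, 0, 1, 0]]
true_mask = [[0, 1, 0, 0],
             [0, 0, 1, 1]]
print(iou(pred_mask, true_mask))            # 2 shared pixels / 4 in the union
print(count_mae([3, 5, 2], [4, 5, 0]))      # (1 + 0 + 2) / 3
```

Because ship tracks are thin, a prediction that is offset by only a few pixels can lose most of its overlap with the label, which is one reason IoU on elongated features tends to be lower than on compact objects.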
