Scalable Higher Resolution Polar Sea Ice Classification and Freeboard Calculation from ICESat-2 ATL03 Data

Jurdana Masuma Iqrah, Younghyun Koo, Wei Wang, Hongjie Xie, Sushil K. Prasad

TL;DR

This work addresses the need for higher-resolution sea ice surface height and freeboard information beyond the ATL07/ATL10 products by reprocessing ICESat-2 ATL03 data at 2 m resolution. It couples Sentinel-2–based auto-labeling with deep learning (LSTM and MLP) to classify ATL03 segments into thick ice, thin ice, and open water, and uses Horovod for distributed training to scale on multi-GPU clusters. The authors also implement PySpark-based parallelization for auto-labeling and freeboard computation, achieving up to 16.25x auto-labeling speedups and 8.5x data-loading plus 15.7x map-reduce speedups for freeboard, with the LSTM model reaching 96.56% accuracy versus 91.80% for the MLP. The resulting high-resolution local sea surface height and freeboard products offer improved representations of sea ice dynamics in polar regions, demonstrating scalable methods that could enable polar-wide products in a cloud-enabled pipeline.

Abstract

ICESat-2 (IS2) is a NASA Earth-observing satellite that measures surface elevation at high resolution. IS2's ATL07 and ATL10 sea ice elevation and freeboard products are reported in 10 m-200 m segments, each aggregating 150 signal photons from the raw ATL03 (geolocated photon) data. These aggregated products can potentially overestimate local sea surface height and thus underestimate freeboard (sea ice height above the sea surface). To obtain sea surface height and freeboard information at higher resolution, in this work we resample the ATL03 data with a 2 m window. We then classify these 2 m segments into thick sea ice, thin ice, and open water using deep learning methods (long short-term memory (LSTM) and multi-layer perceptron (MLP) models). To obtain labeled training data for our deep learning models, we use segmented Sentinel-2 (S2) multi-spectral imagery that overlaps IS2 tracks in space and time to auto-label IS2 data, followed by manual corrections in transition regions between different ice/water types and in cloudy regions. We parallelize this auto-labeling workflow with PySpark, achieving a 9-fold data-loading speedup and a 16.25-fold map-reduce speedup. To train our models, we employ a Horovod-based distributed deep-learning workflow on a DGX A100 cluster with 8 GPUs, achieving a 7.25-fold speedup. Next, we calculate the local sea surface heights from the open water segments. Finally, we scale the freeboard calculation using the derived local sea level, achieving an 8.54-fold data-loading speedup and a 15.7-fold map-reduce speedup. Compared with the ATL07 (local sea level) and ATL10 (freeboard) data products, our results show higher resolution and accuracy (96.56%).
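
The two numerical steps named in the abstract, resampling photons into 2 m windows and subtracting a local sea surface height from segment heights, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the use of a plain mean as the aggregation, the integer class labels (0 = thick ice, 1 = thin ice, 2 = open water), and the single track-wide sea surface value are all assumptions made for illustration; the abstract specifies none of them.

```python
import numpy as np


def resample_windows(along_track_m, height_m, window_m=2.0):
    """Average photon heights in fixed along-track windows (assumed: plain mean).

    Returns the centers of the non-empty windows and the mean height in each.
    """
    along_track_m = np.asarray(along_track_m, dtype=float)
    height_m = np.asarray(height_m, dtype=float)
    start = along_track_m.min()
    # Index of the window each photon falls into, counted from the first photon.
    idx = np.floor((along_track_m - start) / window_m).astype(int)
    n_windows = idx.max() + 1
    counts = np.bincount(idx, minlength=n_windows)
    sums = np.bincount(idx, weights=height_m, minlength=n_windows)
    valid = counts > 0  # drop windows that contain no photons
    centers = start + (np.arange(n_windows) + 0.5) * window_m
    return centers[valid], sums[valid] / counts[valid]


def freeboard(segment_height_m, segment_class, water_label=2):
    """Freeboard = segment height minus local sea surface height.

    The sea surface is taken here as the mean height of all open-water
    segments passed in (an assumption; a real pipeline would likely
    estimate it locally along the track).
    """
    segment_height_m = np.asarray(segment_height_m, dtype=float)
    water = np.asarray(segment_class) == water_label
    if not water.any():
        raise ValueError("no open-water segments to estimate the sea surface")
    sea_surface_m = segment_height_m[water].mean()
    return segment_height_m - sea_surface_m, sea_surface_m


# Hypothetical photons: along-track distance (m) and height (m).
centers, heights = resample_windows(
    [0.1, 1.5, 2.2, 3.9, 6.5], [0.4, 0.6, 0.1, 0.3, 0.0]
)
# Hypothetical class labels for the three non-empty 2 m segments.
fb, ssh = freeboard(heights, [0, 1, 2])
```

In this toy input the window between 4.1 m and 6.1 m contains no photons and is dropped, so three segments remain. The last one is labeled open water and sets the sea surface reference for the other two.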

Paper Structure

This paper contains 30 sections, 1 equation, 11 figures, 5 tables.

Figures (11)

  • Figure 1: ATL03 Sea Ice Classification and Freeboard Computation Workflow
  • Figure 2: Auto-labeling of IS2 (line) elevations into thick sea ice, thin ice, and open water based on S2 (image) classified surface types: (a) IS2 track elevation over S2 image, (b) Auto-labeling IS2 surface types using S2 classified surface types, and (c) Auto-labeled IS2 surface types over S2 image.
  • Figure 3: Workflow for IS2 sea ice classification deep learning model inferencing.
  • Figure 4: Sea-ice Classification Confusion Matrix
  • Figure 5: Distributed model training via Horovod framework, (a) distributed training speedup, (b) total training time over multiple GPUs, (c) data processed per second for each epoch and (d) time for each epoch.
  • ...and 6 more figures