CuMoLoS-MAE: A Masked Autoencoder for Remote Sensing Data Reconstruction

Anurup Naskar, Nathanael Zhixin Wong, Sara Shamekh

TL;DR

CuMoLoS-MAE tackles noisy remote-sensing atmospheric profiling by introducing a curriculum-guided masked autoencoder with Monte Carlo ensembling. It uses micro-patch MAE within a ViT framework to reconstruct fine-scale vertical velocity fields and to produce per-pixel uncertainty maps, with uncertainty estimated by averaging over $N$ random masks ($\bar{X} = \frac{1}{N}\sum_{i=1}^{N} \hat{X}^{(i)}$, $\sigma_X = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (\hat{X}^{(i)}-\bar{X})^2}$). The approach achieves state-of-the-art reconstruction quality and reliable uncertainty estimates on Doppler lidar data from ARM SGP, while revealing the trade-offs between temporal context and spectral fidelity. This enables improved convection diagnostics, real-time data assimilation, and more robust long-term climate reanalysis, with potential for generalization across lidar systems and operational deployment.
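
The ensemble mean and per-pixel spread described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: `reconstruct` is a hypothetical stand-in for a trained MAE forward pass (here it only perturbs the hidden pixels with noise), and the 4-pixel patch size and 0.7 mask ratio are assumed values chosen for the example.

```python
import numpy as np

def random_patch_mask(rng, grid=16, mask_ratio=0.7):
    """Boolean mask over a grid x grid patch layout; True marks a hidden patch."""
    n = grid * grid
    n_hidden = int(round(mask_ratio * n))
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=n_hidden, replace=False)] = True
    return mask.reshape(grid, grid)

def reconstruct(field, patch_mask, rng, patch=4):
    """Hypothetical stand-in for the MAE: visible pixels pass through unchanged,
    hidden pixels get the true value plus noise (a real model would predict them)."""
    pixel_mask = np.repeat(np.repeat(patch_mask, patch, axis=0), patch, axis=1)
    noise = rng.normal(scale=0.1, size=field.shape)
    return np.where(pixel_mask, field + noise, field)

def mc_ensemble(field, n_masks=50, seed=0):
    """Run the model once per random mask, then aggregate pixel-wise."""
    rng = np.random.default_rng(seed)
    recons = np.stack([
        reconstruct(field, random_patch_mask(rng), rng)
        for _ in range(n_masks)
    ])
    x_bar = recons.mean(axis=0)    # ensemble mean, the formula for X-bar
    sigma_x = recons.std(axis=0)   # ddof=0 matches the 1/N normalisation
    return x_bar, sigma_x

# Synthetic 64x64 field standing in for a Doppler lidar time-height image.
field = np.random.default_rng(1).normal(size=(64, 64))
x_bar, sigma_x = mc_ensemble(field)
```

Using the population standard deviation (`ddof=0`) keeps the sketch consistent with the $\frac{1}{N}$ factor in the expression for $\sigma_X$.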

Abstract

Accurate atmospheric profiles from remote sensing instruments such as Doppler Lidar, Radar, and radiometers are frequently corrupted by low-SNR (Signal to Noise Ratio) gates, range folding, and spurious discontinuities. Traditional gap filling blurs fine-scale structures, whereas deep models lack confidence estimates. We present CuMoLoS-MAE, a Curriculum-Guided Monte Carlo Stochastic Ensemble Masked Autoencoder designed to (i) restore fine-scale features such as updraft and downdraft cores, shear lines, and small vortices, (ii) learn a data-driven prior over atmospheric fields, and (iii) quantify pixel-wise uncertainty. During training, CuMoLoS-MAE employs a mask-ratio curriculum that forces a ViT decoder to reconstruct from progressively sparser context. At inference, we approximate the posterior predictive by Monte Carlo over random mask realisations, evaluating the MAE multiple times and aggregating the outputs to obtain the posterior predictive mean reconstruction together with a finely resolved per-pixel uncertainty map. Together with high-fidelity reconstruction, this novel deep learning-based workflow enables enhanced convection diagnostics, supports real-time data assimilation, and improves long-term climate reanalysis.

Paper Structure

This paper contains 11 sections, 2 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: From Doppler lidar time–height arrays we form $64\times64$ images. During training we randomly hide a subset of patch tokens (the mask ratio increases from 0.5 to 0.7 over a curriculum), add positional encodings, pass the visible tokens through a ViT encoder, and reconstruct the full field with a lightweight decoder. The loss is computed only on the hidden pixels. At test time, for each unseen patch we draw 50 independent random masks and run the same pipeline to produce multiple reconstructions. We then average these reconstructions to obtain a single best denoised estimate of the field and use the pixel-wise spread of the ensemble as an uncertainty map.
  • Figure 2: Visual diagnostics for one sample. Left: log-PSD at selected range gates—reconstruction matches the original (small gaps) except at Gate 0 near the boundary-layer base. Centre-left: original Doppler-lidar vertical velocity. Centre-right: CuMoLoS-MAE reconstruction. Right: per-pixel uncertainty map $\sigma_X$.
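
The training recipe in the Figure 1 caption (a mask ratio rising from 0.5 to 0.7, with the loss taken only on hidden pixels) can be sketched as follows. The caption does not state the schedule shape or the loss function, so the linear ramp and the mean-squared error used here are assumptions, and the helper names `mask_ratio_at` and `masked_mse` are hypothetical.

```python
import numpy as np

def mask_ratio_at(epoch, n_epochs, start=0.5, end=0.7):
    """Mask ratio for a given epoch. A linear ramp is assumed; the source
    only says the ratio increases from 0.5 to 0.7 over a curriculum."""
    if n_epochs <= 1:
        return end
    frac = min(max(epoch / (n_epochs - 1), 0.0), 1.0)
    return start + frac * (end - start)

def masked_mse(pred, target, pixel_mask):
    """Mean-squared error over hidden pixels only (MSE itself is an assumption)."""
    diff = (pred - target)[pixel_mask]
    return float(np.mean(diff ** 2))

# Toy check: errors on visible pixels must not contribute to the loss.
rng = np.random.default_rng(0)
target = rng.normal(size=(64, 64))
pred = target.copy()
pixel_mask = np.zeros((64, 64), dtype=bool)
pixel_mask[:32] = True   # top half hidden
pred[:32] += 1.0         # unit error on hidden pixels
pred[32:] += 5.0         # large error on visible pixels, ignored by the loss

loss = masked_mse(pred, target, pixel_mask)
ratios = [mask_ratio_at(e, 5) for e in range(5)]
```

In this toy setup the loss is close to 1, reflecting only the unit error placed on the hidden half, and `ratios` steps evenly from 0.5 up to 0.7.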