SatFlow: Generative model based framework for producing High Resolution Gap Free Remote Sensing Imagery

Bharath Irigireddy, Varaprasad Bandaru

TL;DR

SatFlow tackles the problem of producing frequent, high-resolution, gap-free surface reflectance imagery by fusing daily MODIS data with Landsat observations using a conditional flow matching framework. It learns a time-varying vector field to transform noise into Landsat-like imagery conditioned on MODIS context and a gap-free Landsat composite, enabling robust cloud and scanline gap imputation. Empirical results show improvements over STARFM and conditional diffusion, with strong performance under heavy cloud occlusion when MODIS information is incorporated, demonstrating the method's potential for long-term environmental monitoring and phenology studies. The approach offers a scalable pipeline with extensibility to additional sensors and modalities, supporting near real-time, high-resolution environmental analysis.

Abstract

Frequent, high-resolution remote sensing imagery is crucial for agricultural and environmental monitoring. Landsat satellites offer detailed imagery at 30 m resolution but with low temporal frequency, whereas missions like MODIS and VIIRS provide daily coverage at coarser resolutions. Clouds and cloud shadows contaminate about 55% of optical remote sensing observations, posing an additional challenge. To address these challenges, we present SatFlow, a generative model-based framework that fuses low-resolution MODIS imagery and Landsat observations to produce frequent, high-resolution, gap-free surface reflectance imagery. Our model, trained via Conditional Flow Matching, demonstrates improved performance in generating imagery with preserved structural and spectral integrity. Cloud imputation is treated as an image inpainting task: during inference, the model uses the learned generative process to reconstruct cloud-contaminated pixels and fill gaps caused by scan lines. Experimental results demonstrate that our approach reliably imputes cloud-covered regions. This capability is crucial for downstream applications such as crop phenology tracking and environmental change detection.
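
To make the inpainting idea concrete, the following is a minimal sketch of one common way to fill masked pixels with a flow-based sampler; it is not taken from the paper. It assumes a straight-line probability path from noise to data, plain Euler integration, and a step-wise overwrite of cloud-free pixels with their known position on that path. The names `inpaint`, `toy_vector_field`, and `valid_mask` are hypothetical, and the toy field stands in for the trained network (it simply points at a known reference image, which a real model conditioned on MODIS and a Landsat composite would not have access to).

```python
import numpy as np

def toy_vector_field(x, t, reference):
    """Hypothetical stand-in for the trained network u_theta(x_t, t, c).

    On the straight-line path x_t = (1 - t) * x0 + t * x1, the velocity that
    carries x_t to x1 is (x1 - x_t) / (1 - t). Here x1 is a fixed reference.
    """
    return (reference - x) / (1.0 - t)

def inpaint(observed, valid_mask, vector_field, n_steps=20, seed=0):
    """Fill pixels where valid_mask is False by integrating the flow ODE.

    observed:   (C, H, W) image; values under the gaps are ignored.
    valid_mask: (H, W) boolean, True where the pixel is cloud-free.
    """
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(observed.shape)  # starting noise
    x = x0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        # Assumption: clear pixels are pinned to their known point on the path.
        x = np.where(valid_mask, (1.0 - t) * x0 + t * observed, x)
        # One Euler step along the (toy) vector field.
        x = x + dt * vector_field(x, t)
    # Clear pixels are returned exactly as observed.
    return np.where(valid_mask, observed, x)

# Toy example: a 2-band, 4x4 scene with a 2x2 "cloud" in the middle.
truth = np.linspace(0.0, 1.0, 2 * 4 * 4).reshape(2, 4, 4)
valid_mask = np.ones((4, 4), dtype=bool)
valid_mask[1:3, 1:3] = False
observed = np.where(valid_mask, truth, 0.0)

filled = inpaint(observed, valid_mask,
                 lambda x, t: toy_vector_field(x, t, truth))
```

In this sketch the clear pixels pass through unchanged, while the gap pixels are produced entirely by the sampler; with a learned field, the quality of the fill would depend on the conditioning inputs rather than on a known reference.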

Paper Structure

This paper contains 14 sections, 7 equations, 3 figures, 3 tables, 3 algorithms.

Figures (3)

  • Figure 1: The framework integrates MODIS and Landsat observations through conditional flow matching to downscale MODIS imagery (500m) to Landsat resolution (30m).
  • Figure 2: The conditioning inputs are concatenated along the channel dimension with the current state $x_t$. The current time step $t$ and metadata are encoded via learned embeddings and integrated into the network at multiple resolutions. The network predicts the vector field $u_{\theta}(x_t, t, c)$, and the MSE loss is computed between the predicted and target vector fields.
  • Figure 3: Example of artifacts introduced by Quality Assessment misclassification. The images on the left show the original cloudy Landsat image, and the images on the right show resulting artifacts in the gap-filled output.
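
The training setup described in the Figure 2 caption can be sketched roughly as follows. This is an illustrative example, not the authors' implementation: it assumes the widely used linear interpolation path $x_t = (1 - t)\,x_0 + t\,x_1$ with target vector field $x_1 - x_0$, which the caption does not specify. The function names, band counts, and tile size are hypothetical, and the network itself is omitted.

```python
import numpy as np

def cfm_training_pair(x1, cond, rng):
    """Build one flow-matching training example (channel-first arrays).

    x1:   (C, H, W) target Landsat-like image.
    cond: (K, H, W) conditioning stack, e.g. resampled MODIS and a composite.
    """
    x0 = rng.standard_normal(x1.shape)   # noise sample
    t = rng.uniform()                    # time step in [0, 1)
    x_t = (1.0 - t) * x0 + t * x1        # assumed linear path
    target = x1 - x0                     # target vector field for this path
    # Conditioning is concatenated with x_t along the channel dimension.
    net_input = np.concatenate([x_t, cond], axis=0)
    return net_input, t, target

def mse_loss(pred, target):
    """Mean squared error between predicted and target vector fields."""
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
x1 = rng.uniform(0.0, 1.0, size=(6, 8, 8))     # hypothetical: 6 bands, 8x8 tile
cond = rng.uniform(0.0, 1.0, size=(12, 8, 8))  # hypothetical conditioning stack
net_input, t, target = cfm_training_pair(x1, cond, rng)

# A real model would compute pred = u_theta(net_input, t, metadata);
# here a zero prediction is used only to exercise the loss.
loss = mse_loss(np.zeros_like(target), target)
```

Under the assumed path, following the target field from $x_t$ for the remaining time $1 - t$ lands exactly on $x_1$, which is why regressing onto $x_1 - x_0$ teaches the network to transport noise to imagery.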