Oya: Deep Learning for Accurate Global Precipitation Estimation

Emmanuel Asiedu Brempong, Mohammed Alewi Hassen, MohamedElfatih MohamedKhair, Vusumuzi Dube, Santiago Hincapie Potes, Olivia Graham, Amanie Brik, Amy McGovern, George J. Huffman, Jason Hickey

TL;DR

The paper addresses the need for accurate, timely precipitation estimates in data-sparse regions by introducing Oya, a two-stage U-Net framework that uses the full VIS-IR spectrum from geostationary (GEO) satellites. A detection model followed by a quantitative estimation model addresses the extreme imbalance between rain and no-rain events. The models are trained against high-quality CORRA v07 data and pre-trained on IMERG Final to mitigate overfitting caused by CORRA's limited temporal sampling, and combining multiple GEO satellites yields quasi-global coverage. Results show that Oya outperforms GEO-only baselines and approaches the accuracy of real-time IMERG Early, while remaining competitive with the research-grade IMERG Final, particularly for light to moderate precipitation. Ablations confirm the value of using all GEO channels, data augmentation, patch context, and LDS losses, and the authors provide a publicly available quasi-global precipitation dataset generated from the models.

Abstract

Accurate precipitation estimation is critical for hydrological applications, especially in the Global South where ground-based observation networks are sparse and forecasting skill is limited. Existing satellite-based precipitation products often rely on the longwave infrared channel alone or are calibrated with data that can introduce significant errors, particularly at sub-daily timescales. This study introduces Oya, a novel real-time precipitation retrieval algorithm utilizing the full spectrum of visible and infrared (VIS-IR) observations from geostationary (GEO) satellites. Oya employs a two-stage deep learning approach, combining two U-Net models: one for precipitation detection and another for quantitative precipitation estimation (QPE), to address the inherent data imbalance between rain and no-rain events. The models are trained using high-resolution GPM Combined Radar-Radiometer Algorithm (CORRA) v07 data as ground truth and pre-trained on IMERG-Final retrievals to enhance robustness and mitigate overfitting due to the limited temporal sampling of CORRA. By leveraging multiple GEO satellites, Oya achieves quasi-global coverage and demonstrates superior performance compared to existing competitive regional and global precipitation baselines, offering a promising pathway to improved precipitation monitoring and forecasting.
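
The two-stage idea described above can be illustrated with a minimal sketch. The abstract does not say how the detection and QPE outputs are merged, so the masking rule, the function name `combine_two_stage`, and the 0.5 probability threshold below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def combine_two_stage(
    rain_prob: Sequence[float],
    rain_rate: Sequence[float],
    prob_threshold: float = 0.5,  # assumed cut-off; not stated in the source
) -> List[float]:
    """Merge a detection output with a QPE output, pixel by pixel.

    rain_prob: per-pixel rain probability from the (hypothetical) detection model.
    rain_rate: per-pixel rain rate in mm/h from the (hypothetical) regression model.
    Pixels the detector marks as dry are set to 0; negative rates are clipped to 0.
    """
    if len(rain_prob) != len(rain_rate):
        raise ValueError("rain_prob and rain_rate must have the same length")
    return [
        max(rate, 0.0) if prob >= prob_threshold else 0.0
        for prob, rate in zip(rain_prob, rain_rate)
    ]


# Three pixels: confident rain, confident no-rain, rain with a negative regression output.
final_rate = combine_two_stage([0.9, 0.1, 0.6], [2.5, 4.0, -0.3])
```

Separating detection from estimation in this way lets the regression stage concentrate on raining pixels, which is one plausible way to handle the rain/no-rain imbalance the abstract mentions.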

Paper Structure

This paper contains 29 sections, 7 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Example creation pipeline. (Left) A false color image of the Meteosat $0^{\circ}$ observation for April 9, 2022, 13:15 UTC, overlaid with the GPM CORRA precipitation swath between the start and end time of the MSG snapshot. (Middle) The patches produced by the example creation pipeline. Valid patches, that is, those that have corresponding GPM CORRA observations, are highlighted in red. (Right) An example input-output pair showing the Meteosat $0^{\circ}$ observations (false color image) overlaid with the corresponding GPM CORRA precipitation retrieval.
  • Figure 2: Distribution of GPM CORRA v07 observations. (a) Distribution of no-precipitation, light, medium, heavy and extreme precipitation events, showing the imbalance between the no-precipitation and precipitation observations. (b) Distribution of precipitation events from (a) showing that in the absence of no-precipitation observations, light precipitation observations heavily outweigh the medium to extreme precipitation events.
  • Figure 3: (a) The Oya precipitation estimation model, consisting of two UNet models: a classification model to detect precipitation events and a regression model to estimate the actual amount of precipitation. (b) The UNet architecture used for each model, which consists of an encoder and a decoder with skip connections.
  • Figure 4: Retrieval accuracy metrics of the model trained over Africa against GPM CORRA observations in 2022. (a) Critical Success Index (CSI) for Oya, IMERG Final, IMERG Early, PDIR and CRR retrievals at different precipitation intensities. (b) CSI, Probability of Detection (POD), False Alarm Ratio (FAR) and Bias for each retrieval at a threshold of $0.2\,\mathrm{mm\,h^{-1}}$. The best result for each metric is in bold font.
  • Figure 5: As in Figure 4 but for the model trained over Meteosat $0^{\circ}$ coverage.
  • ...and 6 more figures
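
The categorical scores named in the Figure 4 caption (CSI, POD, FAR, Bias) can be computed from a contingency table of hits, misses, and false alarms. The sketch below uses the standard textbook definitions and the $0.2\,\mathrm{mm\,h^{-1}}$ threshold from the caption. It assumes that an event means a rate at or above the threshold and that "Bias" means frequency bias; the source states neither, and the function name `categorical_scores` is hypothetical.

```python
import math
from typing import Dict, Sequence


def categorical_scores(
    estimate: Sequence[float],
    reference: Sequence[float],
    threshold: float = 0.2,  # mm/h, the threshold quoted in the Figure 4 caption
) -> Dict[str, float]:
    """Contingency-table scores for a rain-rate estimate against a reference."""
    hits = misses = false_alarms = 0
    for est, ref in zip(estimate, reference):
        est_event = est >= threshold  # assumption: ">=" defines an event
        ref_event = ref >= threshold
        if est_event and ref_event:
            hits += 1
        elif ref_event:
            misses += 1
        elif est_event:
            false_alarms += 1

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a score is undefined.
        return num / den if den else math.nan

    return {
        "CSI": ratio(hits, hits + misses + false_alarms),
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        # Assumption: "Bias" is the frequency bias (estimated events / observed events).
        "Bias": ratio(hits + false_alarms, hits + misses),
    }


# Toy data: 2 hits, 1 miss, 2 false alarms, 1 correct negative.
scores = categorical_scores(
    estimate=[0.0, 0.5, 1.0, 0.3, 0.0, 2.0],
    reference=[0.0, 0.4, 0.0, 0.1, 0.6, 3.0],
)
```

With these definitions, CSI, POD, and FAR lie in [0, 1], and a frequency bias above 1 indicates that the estimate flags rain more often than the reference does.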