RiverScope: High-Resolution River Masking Dataset

Rangel Daroya, Taylor Rowley, Jonathan Flores, Elisa Friedmann, Fiona Bennitt, Heejin An, Travis Simmons, Marissa Jean Hughes, Camryn L Kluetmeier, Solomon Kica, J. Daniel Vélez, Sarah E. Esenther, Thomas E. Howard, Yanqi Ye, Audrey Turcotte, Colin Gleason, Subhransu Maji

TL;DR

RiverScope addresses the need for fine-scale river monitoring by delivering a global 3 m/pixel PlanetScope dataset with expert water masks (1,145 images over 2,577 km^2) co-registered to SWOT, SWORD, and Sentinel-2 for cross-sensor benchmarking. The work benchmarks 27 segmentation and width-estimation models across architectures and pretraining regimes, and introduces a global river width benchmark achieving a median error of 7.2 meters, far outperforming Landsat, Sentinel, and SWOT-derived widths. It demonstrates that 4-channel multispectral inputs with learned linear adaptors and high-resolution training yield state-of-the-art segmentation and width estimates, while also analyzing cost-accuracy trade-offs among sensors. This resource enables fine-scale hydrological modeling, supports climate adaptation, and invites the ML community to advance multi-sensor river monitoring.

Abstract

Surface water dynamics play a critical role in Earth's climate system, influencing ecosystems, agriculture, disaster resilience, and sustainable development. Yet monitoring rivers and surface water at fine spatial and temporal scales remains challenging -- especially for narrow or sediment-rich rivers that are poorly captured by low-resolution satellite data. To address this, we introduce RiverScope, a high-resolution dataset developed through collaboration between computer science and hydrology experts. RiverScope comprises 1,145 high-resolution images (covering 2,577 square kilometers) with expert-labeled river and surface water masks, requiring over 100 hours of manual annotation. Each image is co-registered with Sentinel-2, SWOT, and the SWOT River Database (SWORD), enabling the evaluation of cost-accuracy trade-offs across sensors -- a key consideration for operational water monitoring. We also establish the first global, high-resolution benchmark for river width estimation, achieving a median error of 7.2 meters -- significantly outperforming existing satellite-derived methods. We extensively evaluate deep networks across multiple architectures (e.g., CNNs and transformers), pretraining strategies (e.g., supervised and self-supervised), and training datasets (e.g., ImageNet and satellite imagery). Our best-performing models combine the benefits of transfer learning with the use of all the multispectral PlanetScope channels via learned adaptors. RiverScope provides a valuable resource for fine-scale and multi-sensor hydrological modeling, supporting climate adaptation and sustainable water management.
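As a rough illustration of how a figure like the 7.2-meter median width error could be computed, the sketch below takes the median of absolute differences between predicted and reference widths. This is an assumption about the metric's form, not the authors' evaluation code; the function name `median_abs_width_error` and the sample widths are hypothetical.

```python
import numpy as np


def median_abs_width_error(pred_widths_m, true_widths_m):
    """Median absolute difference between predicted and reference widths (meters).

    Assumption: the reported "median error" is a median of absolute
    per-location differences; the source does not spell out the formula.
    """
    pred = np.asarray(pred_widths_m, dtype=float)
    true = np.asarray(true_widths_m, dtype=float)
    if pred.shape != true.shape:
        raise ValueError("predicted and reference widths must have the same shape")
    return float(np.median(np.abs(pred - true)))


# Made-up widths at five hypothetical locations, for illustration only.
pred = [31.0, 58.5, 12.0, 104.0, 45.0]
true = [30.0, 52.0, 15.0, 100.0, 45.5]
error_m = median_abs_width_error(pred, true)
```

A median is less sensitive than a mean to a few badly mis-segmented locations, which is one plausible reason to report it for width benchmarks.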

Paper Structure

This paper contains 34 sections, 2 equations, 11 figures, 8 tables.

Figures (11)

  • Figure 1: RiverScope presents a global, high-resolution satellite image dataset focused on rivers, built from PlanetScope [planetlabs] imagery and co-registered with SWOT [vinogradova2025new], SWORD [altenau2021surface], and Sentinel-2 [esa2022sentinel]. On the left, we show the distribution and splits of our expert-labeled dataset, which covers various geographic and hydrological contexts.
  • Figure 2: RiverScope can be used to precisely segment rivers and water bodies (a-b). Existing low-resolution images such as Sentinel (c) tend to over-segment narrow rivers because they capture less detail, which inflates river width estimates. Yellow dots mark SWORD nodes; the orange line represents a section of the SWORD reach used as the river centerline.
  • Figure 3: Segmentation performance of different methods for adapting a 4-channel satellite image to RGB so that existing RGB-pretrained models can be used. 'Drop' refers to dropping the NIR channel, 'Linear' refers to applying a linear layer that converts 4 channels to 3 channels, and 'Random' refers to training 4-channel models without any pretraining. We find that linearly projecting the input from 4 channels to 3 channels works best (raw numbers in Table \ref{table:supp-adaptor-quantitative} (Appendix)).
  • Figure 4: RiverScope-trained models segment river pixels more accurately than Sentinel-trained models. Each subplot shows the average F1 score improvement of a given segmentation model across multiple runs. For each architecture and pretraining combination, hatched bars represent the performance of Sentinel-trained models, while solid bars represent the performance of RiverScope-trained models. We show raw numbers in Tables \ref{table:river-segmentation-results} and \ref{table:supp-river-segmentation-results-planet-baselines} (Appendix).
  • Figure 5: Distribution of width estimates. The widths predicted by the RiverScope model cluster closely around the $y=x$ line.
  • ...and 6 more figures
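
The 'Drop' and 'Linear' adaptors named in the Figure 3 caption can be sketched as a per-pixel linear map from 4 channels to 3, which is equivalent to a 1x1 convolution. The sketch below shows only the forward pass in NumPy and is not the authors' implementation: the function name `linear_adaptor` is hypothetical, the weights here are fixed rather than learned, and placing NIR as the last channel is an assumption about band ordering.

```python
import numpy as np


def linear_adaptor(image_4ch, weight, bias=None):
    """Map a (4, H, W) image to (3, H, W) with a per-pixel linear layer.

    weight has shape (3, 4); bias, if given, has shape (3,).
    In a 'Linear' setup these would be trained with the segmentation
    model; here they are plain arrays supplied by the caller.
    """
    image_4ch = np.asarray(image_4ch, dtype=float)
    weight = np.asarray(weight, dtype=float)
    if image_4ch.ndim != 3 or image_4ch.shape[0] != 4:
        raise ValueError("expected an image of shape (4, H, W)")
    if weight.shape != (3, 4):
        raise ValueError("expected a weight matrix of shape (3, 4)")
    # Contract the input-channel axis: out[o, h, w] = sum_c weight[o, c] * image[c, h, w]
    out = np.einsum("oc,chw->ohw", weight, image_4ch)
    if bias is not None:
        out = out + np.asarray(bias, dtype=float)[:, None, None]
    return out


# 'Drop' is the special case whose weight keeps the first three channels
# and zeroes the fourth (assumed here to be NIR).
drop_weight = np.hstack([np.eye(3), np.zeros((3, 1))])

rng = np.random.default_rng(0)
image = rng.random((4, 8, 8))  # synthetic 4-channel tile, illustration only
rgb_like = linear_adaptor(image, drop_weight)
```

Seen this way, 'Drop' is one fixed point in the space of 3x4 weight matrices, while 'Linear' lets training move away from it so that the NIR channel can contribute to the 3-channel input expected by an RGB-pretrained backbone.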