GeoWATCH for Detecting Heavy Construction in Heterogeneous Time Series of Satellite Images

Jon Crall, Connor Greenwell, David Joy, Matthew Leotta, Aashish Chaudhary, Anthony Hoogs

TL;DR

GeoWATCH tackles the challenge of learning from long, heterogeneous satellite image time series by building a data-interchange framework (KWCoco) and a video-view representation that unify multi-sensor data. It introduces partial weight loading via maximum common subtree embeddings to enable continual model evolution, supporting a lineage of networks whose performance improves while a core backbone is reused. The two-stage heavy construction detection pipeline (broad-area search followed by high-resolution activity characterization) shows continued gains and practical utility for monitoring anthropogenic processes across large spatial and temporal scales. The framework may also apply to geospatial vision tasks beyond construction detection.

Abstract

Learning from multiple sensors is challenging due to spatio-temporal misalignment and differences in resolution and captured spectra. To that end, we introduce GeoWATCH, a flexible framework for training models on long sequences of satellite images sourced from multiple sensor platforms, which is designed to handle image classification, activity recognition, object detection, or object tracking tasks. Our system includes a novel partial weight loading mechanism based on sub-graph isomorphism, which allows a network to be continually trained and modified over many training cycles. This has allowed us to train a lineage of models over a long period of time, and we have observed that performance improves as we adjust configurations while maintaining a core backbone.
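To make the idea of partial weight loading concrete, here is a minimal sketch in plain Python. It is not the GeoWATCH implementation: the paper matches networks through a maximum common subtree embedding, whereas this sketch substitutes a much simpler rule (copy a tensor only when its name and size are identical in both networks). The names `partial_load` and `toy_network`, the layer naming scheme, and the tensor sizes are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Parameter name -> flattened values (a stand-in for real tensors).
Weights = Dict[str, List[float]]


def partial_load(src: Weights, dst: Weights) -> Tuple[Weights, List[str], List[str]]:
    """Copy each source tensor whose name and size match a destination tensor.

    Simplified stand-in for subtree matching: exact name and size only.
    Returns the merged weights, the copied keys, and the destination keys
    that kept their fresh initialization.
    """
    merged: Weights = {}
    copied: List[str] = []
    reinitialized: List[str] = []
    for name, init_values in dst.items():
        candidate = src.get(name)
        if candidate is not None and len(candidate) == len(init_values):
            merged[name] = list(candidate)
            copied.append(name)
        else:
            merged[name] = list(init_values)
            reinitialized.append(name)
    return merged, copied, reinitialized


def toy_network(stems: List[str], num_layers: int, heads: List[str], fill: float) -> Weights:
    """Build a hypothetical network layout: input stems, a backbone, output heads."""
    weights: Weights = {}
    for stem in stems:
        weights[f"stem.{stem}"] = [fill] * 4
    for index in range(num_layers):
        weights[f"backbone.{index}"] = [fill] * 4
    for head in heads:
        weights[f"head.{head}"] = [fill] * 2
    return weights


# Older model: RGB and MSI stems, 8 backbone layers, saliency head.
old = toy_network(["rgb", "msi"], 8, ["saliency"], fill=1.0)
# Newer model: RGB stem only, 24 backbone layers, saliency and class heads.
new = toy_network(["rgb"], 24, ["saliency", "class"], fill=0.0)

merged, copied, reinitialized = partial_load(old, new)
dropped = sorted(set(old) - set(copied))  # source weights with no destination
```

In this toy setup the RGB stem, the first 8 backbone layers, and the saliency head are copied, while the 16 new backbone layers and the class head keep their fresh initialization and the MSI stem is left unused. A name-based rule like this breaks as soon as layers are renamed or nested differently, which is the kind of case a structural matching is meant to handle.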

Paper Structure (9 sections, 5 figures)


Figures (5)

  • Figure 1: GeoWATCH Detection Example. Example prediction for a heavy construction site in the validation dataset. Rows 1 and 2 display true and predicted polygons. Row 3 presents the image data. Rows 4 and 5 feature the 2m GSD phase and saliency heatmaps. Row 6 displays the 10m GSD saliency heatmap. Row 7 compares true and predicted timelines. Category colors include red for "No Activity", yellow for "Site Preparation", green for "Active Construction", and blue for "Post Construction."
  • Figure 2: STAC-to-KWCoco. Given a region of interest, our system runs a STAC query and registers paths to the original images in a KWCoco file. The images are stored natively on disk and we can request heterogeneous subregions of spacetime at arbitrary resampled (or native) resolutions.
  • Figure 3: Prediction Pipeline. Given an enumeration of spacetime sample grids, the input is prepared and passed to a model, which predicts a corresponding set of heatmaps. Each input comes with a per-pixel weight, which is zero if the pixel is NaN or flagged as low quality by the QA mask. The heatmap predictions are accumulated into a pre-allocated buffer for each frame in the larger video. Boundary smoothing weights are used to down-weight the edges of each predicted window. When combined with overlapping windows, this results in a smooth final heatmap corresponding to each larger frame in the original video.
  • Figure 4: Partial Weight Loading. A partial matching between similar networks is established by finding a maximum common subtree embedding [feruglio_maximum_2003]. Unmatched destination weights are reinitialized. In this example the MSI input stem is dropped, the backbone is extended from 8 layers to 24 layers and partially initialized, a new class head is initialized, and the existing saliency head and RGB stem are exactly copied.
  • Figure 5: Instillation Improved Scoring Over Time. Over an 18-month period, our F1 scores for the IARPA SMART BAS task ($\uparrow$ is better) improved on both the training regions (dashed-blue) and the sequestered test regions (solid-green). Each model is fine-tuned under different training conditions from its most recent ancestor, with each step initialized using our partial weight loading approach. The scores were reported to us from an external evaluation of our system.
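The window accumulation described in the Figure 3 caption can be sketched in one dimension with plain Python. This is an illustrative sketch, not GeoWATCH code: the triangular taper, the function names `edge_taper` and `stitch`, and the choice to apply the validity weights during accumulation are assumptions made for the example. The caption only states that window edges are down-weighted and that NaN or low-quality pixels receive zero weight.

```python
import math
from typing import Callable, List


def edge_taper(size: int) -> List[float]:
    """Weights that peak mid-window and shrink toward the edges (assumed shape)."""
    peak = math.ceil(size / 2)
    return [min(i + 1, size - i) / peak for i in range(size)]


def stitch(
    signal: List[float],
    window: int,
    stride: int,
    model: Callable[[List[float]], List[float]],
) -> List[float]:
    """Run `model` on overlapping windows and blend them into one output."""
    n = len(signal)
    total = [0.0] * n       # pre-allocated buffer of weighted predictions
    weight_sum = [0.0] * n  # pre-allocated buffer of accumulated weights
    taper = edge_taper(window)
    for start in range(0, n - window + 1, stride):
        chunk = signal[start:start + window]
        # Zero weight where the input is NaN (stand-in for the QA mask).
        valid = [0.0 if math.isnan(v) else 1.0 for v in chunk]
        clean = [0.0 if math.isnan(v) else v for v in chunk]
        preds = model(clean)
        for offset, (pred, edge_w, valid_w) in enumerate(zip(preds, taper, valid)):
            w = edge_w * valid_w
            total[start + offset] += w * pred
            weight_sum[start + offset] += w
    # Normalize; positions that never received weight stay NaN.
    return [t / w if w > 0 else float("nan") for t, w in zip(total, weight_sum)]


# Toy input with one invalid sample and an identity "model".
signal = [float(i) for i in range(10)]
signal[3] = float("nan")
stitched = stitch(signal, window=4, stride=2, model=lambda xs: list(xs))
```

With the identity model, every valid position recovers its input value and the invalid position stays NaN, which makes it easy to check that the normalization by accumulated weight is correct. A real pipeline would do the same over 2D windows across many frames, with array operations in place of the inner loop.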