Alberta Wells Dataset: Pinpointing Oil and Gas Wells from Satellite Imagery

Pratinav Seth, Michelle Lin, Brefo Dwamena Yaw, Jade Boutot, Mary Kang, David Rolnick

TL;DR

This paper introduces the Alberta Wells Dataset, the first large-scale benchmark for pinpointing oil and gas wells, including abandoned and suspended ones, using medium-resolution satellite imagery. By combining PlanetScope RGBN data with expert-verified AER ST37 ground truth, it provides segmentation maps and COCO-format bounding boxes across over 213,000 wells in Alberta. The authors evaluate a broad set of baseline models for both binary segmentation and object detection, finding that segmentation benefits from larger backbones and near-infrared data, while detection is led by transformer-based approaches like DETR. The dataset enables scalable monitoring of methane emissions and groundwater contamination, supporting climate-action efforts and future discovery of undocumented wells, with open-access data and benchmarking code planned.
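
For readers unfamiliar with the COCO annotation layout mentioned above, the following is a minimal sketch of how well locations in one image patch might be expressed as COCO-style bounding boxes. The function name `wells_to_coco`, the fixed `box_size`, the file naming, and the single "well" category are illustrative assumptions, not the dataset's actual schema or the authors' pipeline.

```python
def wells_to_coco(image_id, width, height, centers, box_size=10):
    """Build a minimal COCO-style dict for one image patch.

    centers  : list of (x, y) pixel coordinates of wells (hypothetical input).
    box_size : side length in pixels of the square box drawn around each
               well (an assumption made for illustration only).
    """
    half = box_size / 2
    annotations = []
    for ann_id, (cx, cy) in enumerate(centers, start=1):
        # Clip each box to the patch boundaries.
        x0 = max(0.0, cx - half)
        y0 = max(0.0, cy - half)
        x1 = min(float(width), cx + half)
        y1 = min(float(height), cy + half)
        w, h = x1 - x0, y1 - y0
        annotations.append({
            "id": ann_id,
            "image_id": image_id,
            "category_id": 1,
            "bbox": [x0, y0, w, h],  # COCO convention: [x, y, width, height]
            "area": w * h,
            "iscrowd": 0,
        })
    return {
        "images": [{
            "id": image_id,
            "width": width,
            "height": height,
            "file_name": f"patch_{image_id}.tif",  # hypothetical naming
        }],
        "annotations": annotations,
        "categories": [{"id": 1, "name": "well"}],
    }


coco = wells_to_coco(0, 256, 256, [(100, 120), (2, 2)])
print(coco["annotations"][0]["bbox"])
```

The second well in the usage example sits near the patch corner, so its box is clipped to the image bounds rather than extending past them.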

Abstract

Millions of abandoned oil and gas wells are scattered across the world, leaching methane into the atmosphere and toxic compounds into the groundwater. Many of these locations are unknown, preventing the wells from being plugged and their polluting effects averted. Remote sensing is a relatively unexplored tool for pinpointing abandoned wells at scale. We introduce the first large-scale benchmark dataset for this problem, leveraging medium-resolution multi-spectral satellite imagery from Planet Labs. Our curated dataset comprises over 213,000 wells (abandoned, suspended, and active) from Alberta, a region with especially high well density, sourced from the Alberta Energy Regulator and verified by domain experts. We evaluate baseline algorithms for well detection and segmentation, showing the promise of computer vision approaches but also significant room for improvement.

Paper Structure

This paper contains 36 sections, 11 figures, 15 tables, 1 algorithm.

Figures (11)

  • Figure 1: Distribution of the number of individual wells in positive samples from the dataset. We also include an equal number of images with no wells at all.
  • Figure 2: Illustration of the outcome of applying our dataset splitting algorithm: In Figures (a) to (c), different colors represent various cluster IDs. In Figure (d), blue refers to the training set, orange to the validation set, and green to the test set.
  • Figure 3: Sample image patches from our dataset, including examples with no wells, two wells, and multiple wells. Additionally, we present qualitative results with predictions generated by our segmentation U-Net (ResNet50) and DETR (ResNet50) models.
  • Figure 4: Sample image patches from our dataset showing failure cases, with predictions generated by our segmentation U-Net and DETR models with ResNet50 backbones.
  • Figure 5: Flowchart depicting our process for AER ST37 dataset cleaning and quality control.
  • ...and 6 more figures