xView: Objects in Context in Overhead Imagery

Darius Lam, Richard Kuzma, Kevin McGee, Samuel Dooley, Michael Laielli, Matthew Klaric, Yaroslav Bulatov, Brendan McCord

TL;DR

xView presents a large-scale, multi-class overhead-imagery dataset for object detection, featuring 60 categories and ~1 million labeled instances across 1,400 km^2 of 0.3 m GSD WorldView-3 imagery. It introduces a rigorous three-stage quality-control annotation workflow and a per-image 1 km^2 chip-based collection strategy to maximize diversity and annotation fidelity. A baseline SSD experiment demonstrates the dataset's multi-scale detection challenges, with multi-resolution training delivering the strongest performance among tested variants. The work positions xView as a versatile benchmark for overhead imagery, with avenues for few-shot learning and domain adaptation to broaden real-world applicability.

Abstract

We introduce a new large-scale dataset for the advancement of object detection techniques and overhead object detection research. This satellite imagery dataset enables research progress pertaining to four key computer vision frontiers. We utilize a novel process for geospatial category detection and bounding box annotation with three stages of quality control. Our data is collected from WorldView-3 satellites at 0.3m ground sample distance, providing higher resolution imagery than most public satellite imagery datasets. We compare xView to other object detection datasets in both natural and overhead imagery domains and then provide a baseline analysis using the Single Shot MultiBox Detector. xView is one of the largest and most diverse publicly available object-detection datasets to date, with over 1 million objects across 60 classes in over 1,400 km^2 of imagery.

Paper Structure

This paper contains 12 sections, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Four of the many views of xView. Imagery comes from different geographical locations with different levels of human use. Imagery in this figure is from DigitalGlobe.
  • Figure 2: Cars from COWC, PASCAL VOC, and xView, respectively [cowc, pascalvoc]. COWC provides object labels as single points corresponding to the center of each car [cowc]. To generate COWC bounding boxes, we create a 20x20 pixel box around each point. COWC and xView imagery in this figure are extracted from a single location. xView imagery in this figure is from DigitalGlobe.
  • Figure 3: The QGIS annotation software. Drawn annotations are shown in red. Imagery in this figure is from DigitalGlobe.
  • Figure 4: Top: xView instance count distribution by class. Bottom: xView pixel area distribution by class.
  • Figure 5: Total Instance Count versus Number of Classes for major object detection datasets. Blue indicates overhead imagery datasets; red indicates natural imagery datasets.
  • ...and 7 more figures
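
The Figure 2 caption mentions turning COWC center-point labels into 20x20 pixel boxes. A minimal sketch of that conversion is below. It is an illustration, not the authors' code: the function name `points_to_boxes`, the `(x_min, y_min, x_max, y_max)` box layout, and the sample coordinates are all assumptions, and the sketch does not clip boxes at image borders.

```python
from typing import Iterable, List, Tuple

Point = Tuple[int, int]            # (x, y) pixel coordinates of an object center
Box = Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max) in pixels (assumed layout)


def points_to_boxes(points: Iterable[Point], size: int = 20) -> List[Box]:
    """Center a size x size pixel box on each point (hypothetical helper)."""
    half = size // 2
    return [(x - half, y - half, x + half, y + half) for x, y in points]


# Made-up center points, for illustration only.
boxes = points_to_boxes([(50, 40), (120, 200)])
```

Each resulting box spans 20 pixels in both directions around its center; boxes near an image edge may extend outside the image and would need handling in real use.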