xView: Objects in Context in Overhead Imagery
Darius Lam, Richard Kuzma, Kevin McGee, Samuel Dooley, Michael Laielli, Matthew Klaric, Yaroslav Bulatov, Brendan McCord
TL;DR
xView presents a large-scale, multi-class overhead-imagery dataset for object detection, featuring 60 categories and ~1 million labeled instances across 1,400 km^2 of 0.3 m GSD WorldView-3 imagery. It introduces a rigorous three-stage quality-control annotation workflow and a per-image 1 km^2 chip-based collection strategy to maximize diversity and annotation fidelity. A baseline SSD experiment demonstrates the dataset's multi-scale detection challenges, with multi-resolution training delivering the strongest performance among tested variants. The work positions xView as a versatile benchmark for overhead imagery, with avenues for few-shot learning and domain adaptation to broaden real-world applicability.
Abstract
We introduce a new large-scale dataset for advancing object detection techniques, with a focus on overhead imagery. This satellite imagery dataset, xView, enables research progress pertaining to four key computer vision frontiers. We utilize a novel process for geospatial category detection and bounding box annotation with three stages of quality control. Our data is collected from the WorldView-3 satellite at 0.3 m ground sample distance, providing higher-resolution imagery than most public satellite imagery datasets. We compare xView to other object detection datasets in both the natural and overhead imagery domains, and then provide a baseline analysis using the Single Shot MultiBox Detector (SSD). xView is one of the largest and most diverse publicly available object detection datasets to date, with over 1 million objects across 60 classes in over 1,400 km^2 of imagery.
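To give a sense of scale, the headline figures above can be turned into rough pixel and density estimates. The following is a minimal back-of-the-envelope sketch, not part of the xView tooling: the function names are hypothetical, the inputs are the rounded values quoted in the abstract (which are stated as lower bounds), and it assumes square pixels at a uniform 0.3 m ground sample distance.

```python
# Back-of-the-envelope scale estimates from the rounded abstract figures.
# All names here are illustrative; none come from the xView release.

GSD_M = 0.3              # ground sample distance, metres per pixel
AREA_KM2 = 1400          # total imaged area (stated as "over 1,400 km^2")
NUM_OBJECTS = 1_000_000  # labeled instances (stated as "over 1 million")
NUM_CLASSES = 60         # object categories


def pixels_per_km_side(gsd_m: float) -> float:
    """Number of pixels spanning 1 km on the ground at the given GSD."""
    return 1000.0 / gsd_m


def summarize(gsd_m: float, area_km2: float,
              num_objects: int, num_classes: int) -> dict:
    """Derive approximate pixel counts and object densities."""
    side_px = pixels_per_km_side(gsd_m)
    px_per_km2 = side_px ** 2
    return {
        "side_px_per_km": side_px,
        "megapixels_per_km2": px_per_km2 / 1e6,
        "total_gigapixels": px_per_km2 * area_km2 / 1e9,
        "objects_per_km2": num_objects / area_km2,
        # A plain average; it says nothing about how instances are
        # actually distributed across the classes.
        "mean_objects_per_class": num_objects / num_classes,
    }


if __name__ == "__main__":
    for name, value in summarize(GSD_M, AREA_KM2, NUM_OBJECTS, NUM_CLASSES).items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions, 1 km on the ground spans roughly 3,333 pixels, each square kilometre is about 11.1 megapixels, the full collection is on the order of 15.6 gigapixels, and the average density is about 714 labeled objects per km^2. Because the abstract gives its totals as "over" the quoted values, these are approximate lower bounds rather than exact dataset statistics.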
