Functional Map of the World
Gordon Christie, Neil Fendley, James Wilson, Ryan Mukherjee
TL;DR
The paper introduces the Functional Map of the World (fMoW), a large-scale remote-sensing dataset of over 1 million images organized into temporal sequences, with 4- and 8-band multispectral data and rich metadata, annotated with 63 categories (including a "false detection" category). It explores joint reasoning over temporal image sequences and metadata using CNN- and LSTM-based baselines, and reports that metadata fusion and temporal context improve classification over image information alone. The dataset is collected through a three-phase pipeline that combines locations derived from volunteered geographic information (VGI) with GeoHIVE crowdsourcing, and it is released in two variants (fMoW-full and fMoW-rgb) that trade completeness against size. The authors publicly release the data, code, and pretrained models, discuss geographic and labeling biases, and highlight potential humanitarian applications such as disaster response, positioning fMoW as a benchmark for multimodal, temporally aware remote-sensing research.
Abstract
We present a new dataset, Functional Map of the World (fMoW), which aims to inspire the development of machine learning models capable of predicting the functional purpose of buildings and land use from temporal sequences of satellite images and a rich set of metadata features. The metadata provided with each image enables reasoning about location, time, sun angles, physical sizes, and other features when making predictions about objects in the image. Our dataset consists of over 1 million images from over 200 countries. For each image, we provide at least one bounding box annotation containing one of 63 categories, including a "false detection" category. We present an analysis of the dataset along with baseline approaches that reason about metadata and temporal views. Our data, code, and pretrained models have been made publicly available.
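To make the idea of reasoning about metadata and temporal views concrete, the following is a minimal sketch, not the authors' implementation. It assumes each view of a location is already reduced to an image feature vector (for example, from a CNN) plus a numeric metadata vector, fuses the two by concatenation, scores them with a linear classifier, and averages the per-view class probabilities over the sequence. The function names, the linear classifier, and the averaging step are illustrative assumptions; the paper's baselines also include LSTM-based models, which are not shown here.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical type: one temporal view = (image features, metadata features).
View = Tuple[Sequence[float], Sequence[float]]


def fuse(image_features: Sequence[float], metadata: Sequence[float]) -> List[float]:
    """Fuse image features and metadata by simple concatenation."""
    return list(image_features) + list(metadata)


def linear_scores(x: Sequence[float],
                  weights: Sequence[Sequence[float]],
                  bias: Sequence[float]) -> List[float]:
    """Compute one score per class: dot(weights[c], x) + bias[c]."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
            for w, b in zip(weights, bias)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_sequence(views: Sequence[View],
                     weights: Sequence[Sequence[float]],
                     bias: Sequence[float]) -> Tuple[int, List[float]]:
    """Average per-view class probabilities across a temporal sequence.

    Returns the index of the most probable class and the averaged probabilities.
    """
    probs = [softmax(linear_scores(fuse(f, m), weights, bias)) for f, m in views]
    n_classes = len(bias)
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg


# Toy example: two views of one location, 2 image features + 1 metadata value.
views = [([1.0, 0.0], [0.5]), ([0.0, 1.0], [0.5])]
weights = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.0]]  # made-up weights for 2 classes
bias = [0.0, 0.0]
label, avg = predict_sequence(views, weights, bias)
```

Averaging probabilities is only one simple way to combine views; a learned sequence model could replace `predict_sequence` without changing the fusion step.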
