SustainBench: Benchmarks for Monitoring the Sustainable Development Goals with Machine Learning

Christopher Yeh, Chenlin Meng, Sherrie Wang, Anne Driscoll, Erik Rozi, Patrick Liu, Jihyeon Lee, Marshall Burke, David B. Lobell, Stefano Ermon

TL;DR

SustainBench addresses critical SDG data gaps by offering a large, standardized suite of ML benchmarks spanning seven SDGs and 15 tasks, leveraging satellite and street-level imagery plus survey-derived labels. It provides 11 publicly released datasets, a public leaderboard, and baseline models to facilitate cross-task benchmarking and method development, including multi-modal and transfer-learning opportunities. The work demonstrates both the potential and current limitations of ML-based SDG measurement, emphasizing careful consideration of biases and privacy while encouraging future enhancements in self-supervised and meta-learning approaches. Overall, SustainBench aims to accelerate research and practical deployment of ML methods for monitoring and advancing the SDGs, particularly in data-poor regions.

Abstract

Progress toward the United Nations Sustainable Development Goals (SDGs) has been hindered by a lack of data on key environmental and socioeconomic indicators, which historically have come from ground surveys with sparse temporal and spatial coverage. Recent advances in machine learning have made it possible to utilize abundant, frequently-updated, and globally available data, such as from satellites or social media, to provide insights into progress toward SDGs. Despite promising early results, approaches to using such data for SDG measurement thus far have largely evaluated on different datasets or used inconsistent evaluation metrics, making it hard to understand whether performance is improving and where additional research would be most fruitful. Furthermore, processing satellite and ground survey data requires domain knowledge that many in the machine learning community lack. In this paper, we introduce SustainBench, a collection of 15 benchmark tasks across 7 SDGs, including tasks related to economic development, agriculture, health, education, water and sanitation, climate action, and life on land. Datasets for 11 of the 15 tasks are released publicly for the first time. Our goals for SustainBench are to (1) lower the barriers to entry for the machine learning community to contribute to measuring and achieving the SDGs; (2) provide standard benchmarks for evaluating machine learning models on tasks across a variety of SDGs; and (3) encourage the development of novel machine learning methods where improved model performance facilitates progress towards the SDGs.

Paper Structure

This paper contains 78 sections, 4 equations, 14 figures, 10 tables.

Figures (14)

  • Figure 1: Datasets and tasks included in SustainBench, ranging from poverty prediction to land cover classification (described in the data section, with additional details in the dataset appendix). Data for 11 out of 15 tasks are publicly released for the first time.
  • Figure 2: A map of how many SDGs are covered in SustainBench for every country. SustainBench has global coverage with an emphasis on low-income countries. In total, 119 countries have at least one task in SustainBench.
  • Figure A1: Maps of geographic SustainBench coverage per SDG.
  • Figure A2: An example of an input satellite image for the DHS survey-based datasets. This image is of cluster 969 from the 2004 DHS survey of Peru, located at latitude and longitude coordinates of (-12.597851, -69.185416). The left image shows the RGB channels from Landsat surface reflectance. The right image shows the Nightlights band from DMSP.
  • Figure A3: An example of an input street-level image from Mapillary for the DHS survey-based datasets. The left image is from cluster 10 of Armenia located at (40.192860, 44.515051). The right image is from cluster 92 of Benin, located at (2.347327, 6.402679).
  • ...and 9 more figures