KidSat: satellite imagery to map childhood poverty dataset and benchmark
Makkunda Sharma, Fan Yang, Duy-Nhat Vo, Esra Suel, Swapnil Mishra, Samir Bhatt, Oliver Fiala, William Rudgard, Seth Flaxman
TL;DR
The paper introduces KidSat, a dataset linking high-resolution satellite imagery with DHS-derived ground truth on multidimensional child poverty across 19 countries in Eastern and Southern Africa from 1997 to 2022. It benchmarks multiple models, including MOSAIKS, DINOv2, and SatMAE, on spatial and temporal generalization tasks, and provides open-source code for dataset construction and evaluation. Findings show that foundation models, especially when fine-tuned with DHS variables, improve spatial poverty prediction over baselines, while temporal forecasting remains more challenging due to distribution shifts. This work offers a scalable resource for fine-grained poverty mapping and policy analysis, highlighting practical trade-offs between imagery resolution, model type, and compute resources.
Abstract
Satellite imagery has emerged as an important tool to analyse demographic, health, and development indicators. While various deep learning models have been built for these tasks, each is specific to a particular problem, with few standard benchmarks available. We propose a new dataset pairing satellite imagery and high-quality survey data on child poverty to benchmark satellite feature representations. Our dataset consists of 33,608 images, each 10 km $\times$ 10 km, from 19 countries in Eastern and Southern Africa over the period 1997-2022. As defined by UNICEF, multidimensional child poverty covers six dimensions and can be calculated from the face-to-face surveys of the Demographic and Health Surveys (DHS) Program. As part of the benchmark, we test spatial as well as temporal generalization, by evaluating on unseen locations and on data collected after the training years. Using our dataset we benchmark multiple models, from low-level satellite imagery models such as MOSAIKS, to deep learning foundation models, which include both generic vision models such as Self-Distillation with no Labels (DINOv2) models and specific satellite imagery models such as SatMAE. We provide open-source code for building the satellite dataset, obtaining ground truth data from DHS, and running the various models assessed in our work.
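The two generalization settings described above can be illustrated with a minimal sketch. This is not the released KidSat code: the `Cluster` record, the function names, the number of folds, and the cutoff year are all hypothetical choices made for illustration. The spatial split holds out whole survey locations so that no test location appears in training, and the temporal split trains only on years before a cutoff.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record for one survey location and its image."""
    cluster_id: str
    country: str
    year: int


def spatial_folds(clusters, n_folds=5, seed=0):
    """Return (train, test) pairs in which test locations are unseen in training.

    n_folds and seed are illustrative defaults, not values from the paper.
    """
    ids = sorted({c.cluster_id for c in clusters})
    rng = random.Random(seed)
    rng.shuffle(ids)
    # Assign each location (not each image) to exactly one fold.
    fold_of = {cid: i % n_folds for i, cid in enumerate(ids)}
    folds = []
    for k in range(n_folds):
        train = [c for c in clusters if fold_of[c.cluster_id] != k]
        test = [c for c in clusters if fold_of[c.cluster_id] == k]
        folds.append((train, test))
    return folds


def temporal_split(clusters, cutoff_year):
    """Train on surveys before cutoff_year, test on surveys from it onward.

    cutoff_year is an assumed parameter chosen by the caller.
    """
    train = [c for c in clusters if c.year < cutoff_year]
    test = [c for c in clusters if c.year >= cutoff_year]
    return train, test
```

Under these assumptions, a model that scores well on `spatial_folds` but poorly on `temporal_split` would be interpolating between known locations rather than forecasting, which is the distinction the benchmark is designed to expose.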
