Combining Deep Learning and Street View Imagery to Map Smallholder Crop Types
Jordi Laguarta Soler, Thomas Friedel, Sherrie Wang
TL;DR
This work tackles the shortage of crop-type maps in low- and middle-income regions by integrating street-view imagery with remote sensing. It introduces an automated pipeline that generates field coordinates from OpenStreetMap, filters and orients street-view images toward crop fields, and uses weakly labeled data (from WebCC, iNaturalist, and GPT-4V) to train street-view crop-type classifiers; the classifiers' predicted labels then train a Sentinel-2-based remote-sensing model to produce wall-to-wall crop-type maps. In Thailand, the approach yields a 10 m resolution map covering rice, cassava, maize, and sugarcane with 93% overall accuracy, alongside a public release of 81,000 ground-reference labels and the full 2022 crop map. The study demonstrates that weak supervision, including GPT-4V labeling, can substantially reduce expert labeling needs while maintaining high performance, offering a scalable path to crop mapping in underserved regions. Limitations include dependence on street-view data availability, update frequency, and domain transfer considerations, but the framework generalizes to other regions and street-level imagery sources.
Abstract
Accurate crop type maps are an essential source of information for monitoring yield progress at scale, projecting global crop production, and planning effective policies. To date, however, crop type maps remain challenging to create in low- and middle-income countries due to a lack of ground truth labels for training machine learning models. Field surveys are the gold standard in terms of accuracy but require an often-prohibitively large amount of time, money, and statistical capacity. In recent years, street-level imagery, such as Google Street View, KartaView, and Mapillary, has become available around the world. Such imagery contains rich information about crop types grown at particular locations and times. In this work, we develop an automated system to generate crop type ground references using deep learning and Google Street View imagery. The method efficiently curates a set of street view images containing crop fields, trains a model to predict crop type by utilizing weakly labeled images from disparate out-of-domain sources, and combines predicted labels with remote sensing time series to create a wall-to-wall crop type map. We show that, in Thailand, the resulting country-wide map of rice, cassava, maize, and sugarcane achieves an accuracy of 93%. We publicly release the first-ever crop type map for all of Thailand for 2022 at 10 m resolution with no gaps. To our knowledge, this is the first time a 10 m resolution, multi-crop map has been created for any smallholder country. As the availability of roadside imagery expands, our pipeline provides a way to map crop types at scale around the globe, especially in underserved smallholder regions.
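One step of the pipeline summarized above is orienting street-view images toward crop fields. The text here does not specify how this is done, so the following is only an illustrative sketch under one plausible assumption: that fields lie to either side of the road, so useful camera headings are perpendicular to the local road direction. The function names `bearing_deg` and `side_headings` are hypothetical and are not taken from the paper.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2.

    Returned in degrees clockwise from north, in the range [0, 360).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360.0


def side_headings(lat1, lon1, lat2, lon2):
    """Headings perpendicular to a road segment (assumed field-facing).

    Takes two consecutive points along a road (for example, two nodes of an
    OpenStreetMap way) and returns (left, right) headings in degrees.
    """
    road = bearing_deg(lat1, lon1, lat2, lon2)
    return ((road - 90.0) % 360.0, (road + 90.0) % 360.0)
```

For a road segment running due north, `side_headings(15.0, 100.0, 15.01, 100.0)` gives headings facing west and east. A real system would likely also need to filter out images where the side view shows buildings or vegetation rather than fields, which this sketch does not attempt.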
