Combining Deep Learning and Street View Imagery to Map Smallholder Crop Types

Jordi Laguarta Soler, Thomas Friedel, Sherrie Wang

TL;DR

This work tackles the shortage of crop-type maps in low- and middle-income regions by combining street-view imagery with remote sensing. It introduces an automated pipeline that generates field coordinates from OpenStreetMap, filters street-view images and orients them toward crop fields, and trains street-view crop-type classifiers on weakly labeled data (from WebCC, iNaturalist, and GPT-4V). The classifiers' predicted labels then train a Sentinel-2-based remote-sensing model that produces wall-to-wall crop-type maps. In Thailand, the approach yields a 10 m resolution map of rice, cassava, maize, and sugarcane with 93% overall accuracy, alongside a public release of 81,000 ground-reference labels and the full 2022 crop map. The study reports that weak supervision, including GPT-4V labeling, can substantially reduce the need for expert labeling while maintaining high performance, offering a scalable path to crop mapping in underserved regions. Limitations include dependence on street-view data availability and update frequency, as well as domain-transfer considerations; the framework is nonetheless intended to extend to other regions and street-level imagery sources.

Abstract

Accurate crop type maps are an essential source of information for monitoring yield progress at scale, projecting global crop production, and planning effective policies. To date, however, crop type maps remain challenging to create in low- and middle-income countries due to a lack of ground truth labels for training machine learning models. Field surveys are the gold standard in terms of accuracy but require an often prohibitively large amount of time, money, and statistical capacity. In recent years, street-level imagery, such as Google Street View, KartaView, and Mapillary, has become available around the world. Such imagery contains rich information about crop types grown at particular locations and times. In this work, we develop an automated system to generate crop type ground references using deep learning and Google Street View imagery. The method efficiently curates a set of street view images containing crop fields, trains a model to predict crop type by utilizing weakly-labelled images from disparate out-of-domain sources, and combines predicted labels with remote sensing time series to create a wall-to-wall crop type map. We show that, in Thailand, the resulting country-wide map of rice, cassava, maize, and sugarcane achieves an accuracy of 93%. We publicly release the first-ever gap-free crop type map for all of Thailand for 2022 at 10 m resolution. To our knowledge, this is the first time a 10 m resolution, multi-crop map has been created for any smallholder country. As the availability of roadside imagery expands, our pipeline provides a way to map crop types at scale around the globe, especially in underserved smallholder regions.
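
The TL;DR mentions orienting street-view images toward crop fields. One plausible way to do that, sketched here as an assumption and not as the authors' documented method, is to take the compass bearing of the local road segment and request views at right angles to it, so the camera looks sideways across the roadside and not along the road. The function names `bearing_deg` and `side_headings` and the example coordinates are hypothetical.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2.

    Returned in degrees clockwise from north, in the range [0, 360).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    east = math.sin(dlon) * math.cos(phi2)
    north = (math.cos(phi1) * math.sin(phi2)
             - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(east, north)) % 360.0


def side_headings(lat1, lon1, lat2, lon2):
    """Headings perpendicular to a road segment, as (right, left).

    Assumption: fields lie to the sides of the road, so a camera heading
    at 90 degrees to the direction of travel faces them.
    """
    road = bearing_deg(lat1, lon1, lat2, lon2)
    return (road + 90.0) % 360.0, (road - 90.0) % 360.0


# Hypothetical road segment running due north.
right, left = side_headings(15.0, 100.5, 15.001, 100.5)
print(f"right-hand heading: {right:.1f}, left-hand heading: {left:.1f}")
```

Each resulting heading could then be passed to whatever imagery service is in use, with a separate filtering step still needed to discard views blocked by roadside occlusions.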

Paper Structure (29 sections, 8 equations, 9 figures, 6 tables)

Figures (9)

  • Figure 1: Top: Street-view images of roadside occlusions present between the car-mounted camera and fields. Bottom: Street-view images after the automated filtering process for the four major crop types in Thailand.
  • Figure 2: Overview of the methods presented in this paper to create a Thailand-wide crop type map. Example field points, ground reference labels, and crop type map are shown for the district of Sawaeng Ha.
  • Figure 3: Spatial and temporal distribution of Google Street View (GSV) in Thailand. Left: Hexbin plot of GSV availability across Thailand. The zoomed-in panel shows the location of street-view images overlaid on a satellite basemap in the district of Sawaeng Ha. Right: Availability of street-view images in Thailand by month, with a clear rise in availability since 2022 and a total of over 3 million images. During the wet season (May to October), shown in the blue box, 1.5 million images are available.
  • Figure 4: Schematic of the process to generate ground reference labels. Each ground reference is composed of crop type and geocoordinates from street-view images.
  • Figure 5: Example images from the two online datasets. Although some images are of crop fields similar to the target street-view task, many images are either close-ups of a plant (especially in iNaturalist), a single plant instead of a field, or label noise (especially in WebCC).
  • ...and 4 more figures