Geospatial Machine Learning Libraries
Adam J. Stewart, Caleb Robinson, Arindam Banerjee
TL;DR
This chapter surveys the geospatial machine learning (GeoML) library landscape, detailing how domain-specific tools address geospatial data challenges that standard ML pipelines struggle with. It analyzes the evolution of GeoML tooling, with in-depth looks at TorchGeo, eo-learn, and Raster Vision, and demonstrates practical workflows through a crop-type mapping case study. The discussion covers data formats, benchmarking, licensing, continuous integration (CI), and governance, highlighting both progress and persistent bottlenecks in reproducibility and scalability. Looking ahead, the chapter identifies foundation models and embeddings, improvements in reuse and ease of use, and independent governance as the forces shaping a more interoperable and sustainable GeoML ecosystem.
Abstract
Recent advances in machine learning have been supported by the emergence of domain-specific software libraries, enabling streamlined workflows and increased reproducibility. For geospatial machine learning (GeoML), the availability of Earth observation data has outpaced the development of domain libraries to handle its unique challenges, such as varying spatial resolutions, spectral properties, temporal cadence, data coverage, coordinate systems, and file formats. This chapter presents a comprehensive overview of GeoML libraries, analyzing their evolution, core functionalities, and the current ecosystem. It also introduces popular GeoML libraries such as TorchGeo, eo-learn, and Raster Vision, detailing their architecture, supported data types, and integration with ML frameworks. Additionally, it discusses common methodologies for data preprocessing, spatial–temporal joins, benchmarking, and the use of pretrained models. Through a case study in crop-type mapping, it demonstrates practical applications of these tools. Best practices in software design, licensing, and testing are highlighted, along with open challenges and future directions, particularly the rise of foundation models and the need for governance in open-source geospatial software. Our aim is to guide practitioners, developers, and researchers in navigating and contributing to the rapidly evolving GeoML landscape.
