Spatial Data Science Languages: commonalities and needs
Edzer Pebesma, Martin Fleischmann, Josiah Parry, Jakub Nowosad, Anita Graser, Dewey Dunnington, Maarten Pronk, Rafael Schouten, Robin Lovelace, Marius Appel, Lorena Abad
TL;DR
This paper investigates how spatial data science operates across R, Python, and Julia, focusing on coordinating workflows for spatial and spatio-temporal data. It synthesizes insights from the Spatial Data Science Languages workshops, identifying challenges in file formats, geodetic handling, data cubes, and cross-language development, and proposes open standards, field-domain alignment, and community practices. The authors compare GIS and modelling conventions, discuss trajectories and moving features, and describe cross-language infrastructure as an emerging need. The findings aim to improve interoperability, reproducibility, and collaboration across language ecosystems in spatial data science.
Abstract
Recent workshops brought together several developers, educators and users of software packages extending popular languages for spatial data handling, with a primary focus on R, Python and Julia. Common challenges discussed included handling of spatial or spatio-temporal support, geodetic coordinates, in-memory vector data formats, data cubes, inter-package dependencies, packaging upstream libraries, differences in habits or conventions between the GIS and physical modelling communities, and statistical models. The following insights were formulated: (i) considering software problems across data science language silos helps to understand and standardise analysis approaches, also outside the domain of formal standardisation bodies; (ii) whether attribute variables have block or point support, and whether they are spatially intensive or extensive, has consequences for permitted operations, and hence for software implementing those; (iii) handling geometries on the sphere rather than on the flat plane requires modifications to the logic of simple features; (iv) managing communities and fostering diversity is a necessary, ongoing effort; and (v) tools for cross-language development need more attention and support.
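
As an illustration of insight (ii), the following minimal Python sketch shows why the intensive/extensive distinction constrains which operations are permitted when polygons are merged. The zone values, the dictionary keys and the helper names `merge_extensive` and `merge_intensive` are hypothetical and chosen only for this example; they are not taken from any of the packages discussed.

```python
# Two adjacent zones described only by area and population (made-up numbers).
zones = [
    {"area_km2": 10.0, "population": 1000.0},
    {"area_km2": 30.0, "population": 600.0},
]


def merge_extensive(values):
    """Extensive variables (e.g. population counts) add up when zones are merged."""
    return sum(values)


def merge_intensive(values, areas):
    """Intensive variables (e.g. population density) need an area-weighted mean."""
    return sum(v * a for v, a in zip(values, areas)) / sum(areas)


areas = [z["area_km2"] for z in zones]
population = [z["population"] for z in zones]
# Density is intensive: persons per square kilometre in each zone.
density = [p / a for p, a in zip(population, areas)]

merged_population = merge_extensive(population)
merged_density = merge_intensive(density, areas)

print(merged_population, merged_density)
```

Summing the two densities, or taking their unweighted mean, would give a value that does not match the merged population divided by the merged area. Software that records whether a variable is intensive or extensive can refuse or adapt such operations.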
