Planetary computing for data-driven environmental policy-making

Patrick Ferris, Michael Dales, Sadiq Jaffer, Amelia Holcomb, Eleanor Toye Scott, Thomas Swinfield, Alison Eyres, Andrew Balmford, David Coomes, Srinivasan Keshav, Anil Madhavapeddy

TL;DR

The paper addresses the challenge of data-driven policy-making in environmental science by proposing planetary computing: a durable, scalable infrastructure for ingesting, transforming, analyzing, and publishing global environmental data. It surveys existing end-to-end platforms (e.g., GEE, MPC) and component frameworks, identifies gaps in reproducibility, traceability, and privacy, and articulates requirements for capability, agency, and survivability in a sustainable ecosystem. Core contributions include a structured view of the data lifecycle, an explicit discussion of uncertainties across data, code, dependencies, and policy, and a roadmap of open research challenges: privacy versus transparency, cross-versioning of data and code, and user-friendly interfaces for non-experts. The work emphasizes long-term resilience and open governance to enable trustworthy, transparent, and auditable environmental policy-making using geospatial data at planetary scale.

Abstract

We make a case for "planetary computing" -- infrastructure to handle the ingestion, transformation, analysis and publication of global data products for furthering environmental science and enabling better-informed policy-making. We draw on our experiences as a team of computer scientists working with environmental scientists on forest carbon and biodiversity preservation, and classify existing solutions by their flexibility in scalably processing geospatial data, and by how well they support building trust in the results via traceability and reproducibility. We identify research gaps at the intersection of computing and environmental science around how to handle continuously changing datasets that are often collected across decades and require careful access control rather than being fully open access.

Paper Structure (26 sections, 6 figures, 1 table)

Figures (6)

  • Figure 1: Two versions of the JRC TMF dataset showing the same land use class data for an area of Indonesia in 2008. The left image shows the 2021 data and the image on the right shows 2022 data. ■ undisturbed, ■ degraded, ■ deforested, ■ regrowth, ■ water, ■ other
  • Figure 2: Difference in terrain ruggedness index (TRI), as calculated by the gdaldem tri command, between GDAL versions 3.2 and 3.3. Ideally the difference would be zero (all black).
  • Figure 3: An illustrative pipeline for two of the motives in §\ref{s:motive} and the policy scenarios that could be derived from the results. For a fuller specification see tmfv2 or eyres_life_2023.
  • Figure 4: Ideal dataflow pipeline for a planetary computing engine
  • Figure 5: The pipeline from Figure \ref{fig:pos-before}, but highlighting computing research challenges around privacy, versioning and longevity. The computing and environmental challenges depend on each other for a trustworthy policy outcome.
  • ...and 1 more figure
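
The Figure 2 caption describes the same command producing different output under two versions of a dependency. A minimal sketch of how that can happen is given below. It rests on an assumption the caption itself does not state: that the difference is related to the default TRI formula, which the GDAL documentation describes as changing in 3.3 from the Wilson formula (mean absolute difference to the eight neighbours) to the Riley formula (square root of the summed squared differences). The function names and the 3x3 elevation grid are hypothetical and purely illustrative; this is not GDAL's implementation.

```python
import math

def neighbours(dem, r, c):
    """Return the eight values surrounding cell (r, c)."""
    return [dem[r + dr][c + dc]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]

def tri_wilson(dem, r, c):
    """Wilson-style TRI: mean absolute difference to the neighbours."""
    centre = dem[r][c]
    return sum(abs(v - centre) for v in neighbours(dem, r, c)) / 8.0

def tri_riley(dem, r, c):
    """Riley-style TRI: square root of the summed squared differences."""
    centre = dem[r][c]
    return math.sqrt(sum((v - centre) ** 2 for v in neighbours(dem, r, c)))

# Hypothetical elevations (metres) for a single 3x3 window.
dem = [
    [10.0, 12.0, 14.0],
    [11.0, 13.0, 15.0],
    [12.0, 14.0, 16.0],
]

wilson = tri_wilson(dem, 1, 1)
riley = tri_riley(dem, 1, 1)
difference = riley - wilson  # non-zero: same input, different formula

print(f"Wilson TRI: {wilson:.3f}")
print(f"Riley TRI:  {riley:.3f}")
print(f"Difference: {difference:.3f}")
```

The input data and the calling code are identical in both cases, yet the result differs, which is why a pipeline that records only its data and scripts, and not the versions of its dependencies, cannot be reproduced reliably.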