Causal machine learning for sustainable agroecosystems
Vasileios Sitokonstantinou, Emiliano Díaz Salas Porras, Jordi Cerdà Bautista, Maria Piles, Ioannis Athanasiadis, Hannah Kerner, Giulia Martini, Lily-belle Sweet, Ilias Tsoumas, Jakob Zscheischler, Gustau Camps-Valls
TL;DR
The paper argues that sustainable agroecosystem decision-making requires causal understanding beyond traditional predictive ML. It proposes a causal ML framework that integrates ML with causal reasoning, both to answer causal questions (causal discovery and effect estimation) and to improve predictions through causality-aware modeling. Eight applications across science, policy, farming, and predictive modeling, including crop-growth model intercomparison and evaluation of digital agriculture tools, illustrate methods such as Invariant Causal Prediction (ICP), estimation of average and conditional average treatment effects (ATE and CATE), the X-learner, and Double Machine Learning (DML). A practical workflow (define causal questions, curate data, state assumptions, select methods, and validate robustness) is discussed alongside data limitations and benchmark needs. Overall, the work aims to support evidence-based decisions that enhance sustainability and food security by quantifying intervention impacts and improving model transferability under changing conditions.
Abstract
In a changing climate, sustainable agriculture is essential for food security and environmental health. However, it is challenging to understand the complex interactions among its biophysical, social, and economic components. Predictive machine learning (ML), with its capacity to learn from data, is leveraged in sustainable agriculture for applications like yield prediction and weather forecasting. Nevertheless, it cannot explain causal mechanisms and remains descriptive rather than prescriptive. To address this gap, we propose causal ML, which merges ML's data processing with causality's ability to reason about change. This facilitates quantifying intervention impacts for evidence-based decision-making and enhances predictive model robustness. We showcase causal ML through eight diverse applications that benefit stakeholders across the agri-food chain, including farmers, policymakers, and researchers.
