Seeing Soil from Space: Towards Robust and Scalable Remote Soil Nutrient Analysis

David Seu, Nicolas Longepe, Gabriel Cioltea, Erik Maidik, Calin Andrei

TL;DR

This work tackles scalable, field-relevant soil nutrient estimation across European croplands using remote sensing and environmental covariates. It proposes a hybrid modeling approach that fuses physics-informed features from radiative transfer models, deep learning embeddings from a foundation model, and traditional covariates within a SCORPAN-inspired digital soil mapping framework. The evaluation protocol, which combines spatial blocking, stratification by agro-ecological zone (AEZ), and conformal uncertainty calibration, yields estimates for SOC, N, P, K, and pH, with SOC and N achieving the strongest accuracy (CCC ≈ 0.77; SOC MAE ≈ 5.1 g/kg; N MAE ≈ 0.4–0.44 g/kg). The framework supports scalable, transparent soil mapping for agronomic decision support, with potential extensions to carbon markets. The authors identify a need for more data in underrepresented AEZs and deeper soil horizons to improve P and K predictions.

Abstract

Environmental variables are increasingly affecting agricultural decision-making, yet accessible and scalable tools for soil assessment remain limited. This study presents a robust and scalable modeling system for estimating soil properties in croplands, including soil organic carbon (SOC), total nitrogen (N), available phosphorus (P), exchangeable potassium (K), and pH, using remote sensing data and environmental covariates. The system employs a hybrid modeling approach, combining the indirect methods of modeling soil through proxies and drivers with direct spectral modeling. We extend current approaches by using interpretable physics-informed covariates derived from radiative transfer models (RTMs) and complex, nonlinear embeddings from a foundation model. We validate the system on a harmonized dataset that covers Europe's cropland soils across diverse pedoclimatic zones. Evaluation is conducted under a robust validation framework that enforces strict spatial blocking, stratified splits, and statistically distinct train-test sets, which deliberately make the evaluation harder and produce more realistic error estimates for unseen regions. The models achieved their highest accuracy for SOC and N. This performance held across unseen locations, under both spatial cross-validation and an independent test set. SOC obtained an MAE of 5.12 g/kg and a CCC of 0.77, and N obtained an MAE of 0.44 g/kg and a CCC of 0.77. We also assess uncertainty through conformal calibration, achieving 90 percent coverage at the target confidence level. This study contributes to the digital advancement of agriculture through the application of scalable, data-driven soil analysis frameworks that can be extended to related domains requiring quantitative soil evaluation, such as carbon markets.
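
For readers unfamiliar with the reported quantities, the following is a minimal sketch of how MAE, Lin's concordance correlation coefficient (CCC), and a conformal interval half-width could be computed. It is not the authors' implementation. The function names are hypothetical, and the conformal step assumes a plain split-conformal scheme on absolute residuals, which this summary does not confirm as the paper's exact procedure.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target (e.g. g/kg)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population moments).

    Unlike Pearson's r, CCC also penalizes bias and scale mismatch
    between predictions and observations.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return float(2.0 * cov / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2))


def split_conformal_halfwidth(cal_true, cal_pred, alpha=0.1):
    """Half-width q such that [pred - q, pred + q] targets 1 - alpha coverage.

    Assumption: split conformal prediction with absolute-residual scores
    computed on a held-out calibration set.
    """
    scores = np.sort(np.abs(np.asarray(cal_true, dtype=float)
                            - np.asarray(cal_pred, dtype=float)))
    n = scores.size
    k = int(np.ceil((n + 1) * (1.0 - alpha)))  # finite-sample corrected rank
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return float(scores[k - 1])
```

With alpha = 0.1, the interval around each new prediction is the point estimate plus or minus the returned half-width, which corresponds to the 90 percent coverage target mentioned above.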

Paper Structure

This paper contains 24 sections, 9 equations, 13 figures, 12 tables.

Figures (13)

  • Figure 1: Histograms of SOC, N, P, K, and pH across the harmonized dataset.
  • Figure 2: Spatial distribution of soil samples included in this study.
  • Figure 3: Temporal distribution of soil samples by year of collection.
  • Figure 4: Spatial distribution of valid HLS observations across soil sampling sites after cloud masking. Colors represent the percentage of available monthly reflectance values in the two years prior to sampling. Coverage is heterogeneous across Europe, necessitating fusion and gap-filling to avoid geographic biases in temporal descriptors.
  • Figure 5: Median and interquartile range of Normalized Difference Vegetation Index (NDVI), Normalized Difference Tillage Index (NDTI), and total precipitation (TP) 24 months before sampling, stratified by AEZ. Temperate systems show regular annual cycles, while subtropical and irrigated systems exhibit water-driven lags and extended greenness. These dynamics represent indirect proxies of nutrient cycling processes.
  • ...and 8 more figures