Soil analysis with machine-learning-based processing of stepped-frequency GPR field measurements: Preliminary study
Chunlei Xu, Michael Pregesbauer, Naga Sravani Chilukuri, Daniel Windhager, Mahsa Yousefi, Pedro Julian, Lothar Ratschbacher
TL;DR
The paper investigates end-to-end ML processing of high-resolution air-coupled SFCW GPR data to predict EMI-derived apparent conductivity (ECaR) for soil analysis in precision agriculture. It reports a large field campaign on a golf course with 3472 co-registered samples covering roughly 6600 m$^2$, using EMI as a proxy ground truth. Among the regression models tested, Random Forest consistently performs best when the data are spatially dense, reaching a Pearson correlation of $r=0.425$ in the best case. The nugget-to-sill ratio (NSR) correlates with all performance metrics. The study highlights NSR as a practical, ground-truth-free metric for model assessment in field surveys and recommends expanding multi-sensor data fusion and remote sensing to improve generalization for precision agriculture.
Abstract
Ground Penetrating Radar (GPR) has been widely studied as a tool for extracting soil parameters relevant to agriculture and horticulture. When combined with Machine Learning (ML) methods, air-coupled Stepped Frequency Continuous Wave Ground Penetrating Radar (SFCW GPR) measurements could offer a cost-effective way to obtain depth-resolved soil data. As a first step in this direction, we conducted an extensive field survey using a tractor-mounted air-coupled SFCW GPR instrument. Using ML-based data processing, we evaluate the GPR instrument's capability by predicting the apparent electrical conductivity (ECaR) measured by a co-recorded Electromagnetic Induction (EMI) instrument. The large-scale field measurement campaign was performed on a golf course and comprises 3472 co-registered and geo-located GPR and EMI data samples distributed over approximately 6600 square meters. This terrain offers high surface homogeneity but also presents the challenge of subtle soil parameter variations. Based on the results, we discuss challenges in this multi-sensor regression setting and propose the use of the nugget-to-sill ratio as a performance metric for evaluating ML models in agricultural field survey applications.
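
To make the nugget-to-sill ratio concrete, the following is a minimal sketch of one common way to estimate it from geo-located values. It is not the authors' procedure, which is not described in this section. It assumes an empirical (Matheron) semivariogram, a nugget obtained by linearly extrapolating the first few lag bins to zero lag, and a sill taken as the mean of the last few bins. The function names, the bin layout, and the synthetic data are all hypothetical.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Matheron estimator: mean of 0.5*(z_i - z_j)^2 over point pairs per lag bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # all unique point pairs
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    semivar = 0.5 * (values[i] - values[j]) ** 2
    which = np.digitize(dist, bin_edges) - 1          # lag-bin index of each pair
    n_bins = len(bin_edges) - 1
    lags = np.full(n_bins, np.nan)
    gamma = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = which == b                             # pairs outside the bins are ignored
        if mask.any():
            lags[b] = dist[mask].mean()
            gamma[b] = semivar[mask].mean()
    keep = ~np.isnan(gamma)
    return lags[keep], gamma[keep]

def nugget_to_sill_ratio(lags, gamma, n_fit=3, n_sill=3):
    """Crude NSR estimate (assumption: no variogram model is fitted).

    Nugget: intercept of a straight line through the first n_fit bins, clipped at 0.
    Sill:   mean semivariance of the last n_sill bins.
    """
    _, intercept = np.polyfit(lags[:n_fit], gamma[:n_fit], 1)
    nugget = max(float(intercept), 0.0)
    sill = float(gamma[-n_sill:].mean())
    return min(nugget / sill, 1.0)

# Synthetic illustration (hypothetical data, not the survey measurements).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 100.0, size=(300, 2))       # positions in metres
bin_edges = np.linspace(0.0, 50.0, 11)                # ten 5 m lag bins

noise = rng.normal(size=300)                          # no spatial structure
smooth = np.sin(coords[:, 0] / 10.0) + np.cos(coords[:, 1] / 10.0)

nsr_noise = nugget_to_sill_ratio(*empirical_semivariogram(coords, noise, bin_edges))
nsr_smooth = nugget_to_sill_ratio(*empirical_semivariogram(coords, smooth, bin_edges))
print(f"NSR noise: {nsr_noise:.2f}, NSR smooth: {nsr_smooth:.2f}")
```

An NSR close to 1 indicates values dominated by spatially uncorrelated variation, while an NSR close to 0 indicates a spatially structured field. Because the estimate needs only positions and values, it can be computed on a map of model predictions without reference measurements.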
