Predicting Soil Macronutrient Levels: A Machine Learning Approach Models Trained on pH, Conductivity, and Average Power of Acid-Base Solutions

Mridul Kumar, Deepali Jain, Zeeshan Saifi, Soami Daya Krishnananda

TL;DR

This study tackles the challenge of real-time soil macronutrient monitoring by training ML regressors on a synthetic acid-base dataset in which the inputs are pH, conductivity, and average power, and the outputs are concentrations of N-related acids and P/K bases. Random forest and neural network models emerge as the most accurate predictors. Validated against laboratory soil tests, the random forest reaches prediction errors of 23.6% for phosphorus (P2O5) and 16% for potassium (K2O), and the neural network 26.3% and 21.8%, respectively. The approach offers a cost-effective, real-time alternative to conventional soil testing, with room for improvement through additional features, larger datasets, and broader nutrient coverage. The work highlights practical potential for on-site nutrient assessment and precise fertilizer management, while acknowledging the difficulty of translating acid-base predictions into standard soil nutrient metrics and the need for further calibration.

Abstract

Soil macronutrients, particularly potassium ions (K$^+$), are indispensable for plant health: they underpin various physiological and biological processes and help plants manage both biotic and abiotic stresses. Deficient macronutrient content results in stunted growth, delayed maturation, and increased vulnerability to environmental stressors, underscoring the need for precise soil nutrient monitoring. Traditional techniques such as chemical assays, atomic absorption spectroscopy, inductively coupled plasma optical emission spectroscopy, and electrochemical methods, albeit advanced, are prohibitively expensive and time-intensive, and thus unsuitable for real-time macronutrient assessment. In this study, we propose a soil testing protocol that uses a dataset derived from synthetic solutions to model soil behaviour. The dataset comprises physical properties including conductivity and pH, with a focus on three key macronutrients: nitrogen (N), phosphorus (P), and potassium (K). Four machine learning algorithms were applied to the dataset, and random forest regressors and neural networks were selected for predicting soil nutrient concentrations. Comparison with laboratory soil testing results revealed prediction errors of 23.6% for phosphorus and 16% for potassium using the random forest model, and 26.3% for phosphorus and 21.8% for potassium using the neural network model. This methodology offers a cost-effective strategy for real-time soil nutrient monitoring and a practical alternative to conventional techniques, helping to sustain nutrient levels conducive to robust crop growth.
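
To make the reported percentages concrete, the following is a minimal sketch of how a prediction error of this kind could be computed for a single nutrient. It assumes the error is the absolute difference between the model prediction and the laboratory value, expressed as a percentage of the laboratory value; the abstract does not state the exact definition. The function name `percent_error` and the sample values are hypothetical and are not taken from the paper.

```python
def percent_error(predicted: float, measured: float) -> float:
    """Absolute prediction error as a percentage of the laboratory value.

    Assumed definition: |predicted - measured| / |measured| * 100.
    """
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return abs(predicted - measured) / abs(measured) * 100.0


# Hypothetical values in arbitrary concentration units (not from the paper).
lab_value = 200.0    # laboratory soil test result
model_value = 180.0  # model prediction for the same sample

print(f"{percent_error(model_value, lab_value):.1f}%")  # prints "10.0%"
```

A relative error like this is only meaningful when the prediction and the laboratory value are expressed in the same nutrient form and units.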

Paper Structure

This paper contains 23 sections, 21 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: This figure shows the preparation of the experimental setup. A. Top view of the beaker in which the characteristics of the final solution are measured. B. Side view of the beaker with the final volume of the solution. C. V-I characteristics of the final solution; the shaded area shows the average power transferred through the solution. D. The experimental setup for measuring the pH and calculating the V-I characteristics of the acid-base solutions.
  • Figure 2: This figure illustrates the Spearman correlation coefficient (SCC) among the dataset's features, including acid-base concentration, pH, P$_{av}$, and conductivity. The elements along the diagonal (from the top-left to the bottom-right) represent the self-correlation of each feature and are consequently non-informative for the analysis.
  • Figure 3: MAE loss on the different k-fold sets for all the algorithms used. a. Linear model without PCA. b. Linear model with PCA. c. Random forest without PCA. d. Random forest with PCA. e. k-NN regressor without PCA. f. k-NN regressor with PCA.
  • Figure 4: Variation of loss with epochs for different 5-fold sets used as training and validation sets.
  • Figure 5: Comparison of mean absolute error between different activation functions with and without the application of principal component analysis.
  • ...and 1 more figure
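
The Spearman correlation coefficient named in the Figure 2 caption can be sketched in a few lines. The code below is an illustrative standard-library implementation (rank each feature, then take the Pearson correlation of the ranks), not the authors' code. The helper names and the sample readings are hypothetical; only the feature names come from the caption.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_matrix(columns):
    """Pairwise Spearman coefficients for a dict of equal-length columns."""
    names = list(columns)
    return {a: {b: spearman(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical readings (not from the paper's dataset).
features = {
    "pH": [2.1, 3.4, 5.0, 6.8, 9.2],
    "conductivity": [9.5, 7.1, 4.0, 2.2, 3.0],
    "P_av": [0.80, 0.61, 0.35, 0.20, 0.27],
}

for name, row in spearman_matrix(features).items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Each diagonal entry of the resulting matrix equals 1 because a feature is perfectly rank-correlated with itself, which is why the caption calls the diagonal non-informative.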