Water Quality Estimation Through Machine Learning Multivariate Analysis

Marco Cardia, Stefano Chessa, Alessio Micheli, Antonella Giuliana Luminare, Francesca Gambineri

TL;DR

This work addresses the need for fast, in situ water quality assessment in agriculture by combining UV-Vis spectroscopy with multitarget regression using a multilayer perceptron (MLP), with PCA for dimensionality reduction and SHAP for interpretability. Using real-world water data from Tuscany, the authors report robust simultaneous predictions of TOC and key ions, with high R² values and acceptable error metrics. The approach offers a scalable, cost-effective soft sensor for real-time monitoring, and the interpretability analysis suggests which spectral regions could be prioritized. The study lays the groundwork for extending the method to other spectral modalities and for using transfer learning to broaden its applicability across regions and water types.

Abstract

Water quality is key to the quality of the agrifood sector. Water is used in agriculture for fertigation, in animal husbandry, and in the agrifood processing industry. With the progressive digitalization of this sector, automatic assessment of water quality is becoming an important asset. In this work, we present the integration of Ultraviolet-Visible (UV-Vis) spectroscopy with Machine Learning for water quality assessment, with the aim of ensuring water safety and compliance with water regulations. Furthermore, we emphasize the importance of model interpretability by employing SHapley Additive exPlanations (SHAP) to understand how absorbance at different wavelengths contributes to the predictions. Our approach demonstrates the potential for rapid, accurate, and interpretable assessment of key water quality parameters.
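
The pipeline summarized above (absorbance spectra, then PCA, then an MLP that predicts several parameters at once) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic stand-ins for UV-Vis spectra, and the number of wavelengths, targets, PCA components, and hidden units are all assumptions chosen only to make the example run. It uses scikit-learn, whose `MLPRegressor` accepts a two-dimensional target array for multitarget regression.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for absorbance spectra (NOT the paper's data).
# Assumed sizes: 200 samples, 150 wavelengths, 3 target parameters.
n_samples, n_wavelengths, n_targets = 200, 150, 3
latent = rng.normal(size=(n_samples, 4))
X = latent @ rng.normal(size=(4, n_wavelengths))
X += 0.05 * rng.normal(size=X.shape)          # measurement noise
Y = latent @ rng.normal(size=(4, n_targets))
Y += 0.05 * rng.normal(size=Y.shape)          # label noise

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

# Scale each wavelength, compress with PCA, then fit one MLP for all targets.
# The component count and layer size are illustrative assumptions.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),
    MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
)
model.fit(X_train, Y_train)

Y_pred = model.predict(X_test)

# One R^2 score per target parameter.
r2_per_target = r2_score(Y_test, Y_pred, multioutput="raw_values")
print(r2_per_target)
```

Fitting a single network on all targets lets the hidden layer share spectral features across parameters, which is the usual motivation for multitarget regression. A SHAP analysis like the one the abstract describes would be a separate step applied to the fitted model and is not shown here.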

Paper Structure

This paper contains 7 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Absorption spectrum of the sample with the median value of the absorption spectra.
  • Figure 2: TOC predicted vs actual values scatter plot.
  • Figure 3: Conductivity predicted vs actual values scatter plot.
  • Figure 4: SHAP values for the MLP model showing the contribution of each wavelength to the prediction of TOC.