Deep Learning Domain Adaptation to Understand Physico-Chemical Processes from Fluorescence Spectroscopy Small Datasets: Application to Ageing of Olive Oil

Umberto Michelucci, Francesca Venturini

TL;DR

A significantly innovative approach in the use of deep learning for spectroscopy is described, transforming it from a black box into a tool for understanding complex biological and chemical processes.

Abstract

Fluorescence spectroscopy is a fundamental tool in life sciences and chemistry, widely used for applications such as environmental monitoring, food quality control, and biomedical diagnostics. However, analysis of spectroscopic data with deep learning, in particular of fluorescence excitation-emission matrices (EEMs), presents significant challenges due to the typically small and sparse datasets available. Furthermore, the analysis of EEMs is difficult due to their high dimensionality and overlapping spectral features. This study proposes a new approach that exploits domain adaptation with pretrained vision models, together with a novel interpretability algorithm, to address these challenges. Thanks to specialised feature engineering of the neural networks described in this work, it becomes possible to gain deeper insights into the physico-chemical processes underlying the data. The proposed approach is demonstrated through the analysis of the oxidation process in extra virgin olive oil (EVOO) during ageing, showing its effectiveness in predicting quality indicators and in identifying the spectral bands, and thus the molecules, involved in the process. This work describes a significantly innovative approach in the use of deep learning for spectroscopy, transforming it from a black box into a tool for understanding complex biological and chemical processes.

Paper Structure (9 sections, 7 figures)

Figures (7)

  • Figure 1: Overview of the phases of the machine learning approach. a, The data preprocessing phase consists of splitting the data set for the leave-one-out (LOO) approach, normalising the pixel values, and preparing the data for the MobileNetV2 input layer by reshaping it and creating the three required layers. b, The transfer-learning and fine-tuning phases allow the network to learn relevant features and create an internal representation of the physico-chemical models that can then be used for the interpretation. c, Information Elimination Approach (IEA) process diagram. DOI references indicate the papers that describe some of the components used.
  • Figure 2: Comparison of the true (blue) and predicted (red) values of the quality indicator $K_{232}$ for all the oils at all oxidation stages (vertical scale on the left axis). The corresponding absolute error (AE) is shown as an area in yellow (vertical scale on the right axis). The Mean Absolute Error (MAE), obtained as the average over all the oxidation stages, is displayed in each panel for each oil.
  • Figure 3: Comparison of the true (blue) and predicted (red) values of the quality indicator $K_{268}$ for all the oils at all oxidation stages (vertical scale on the left axis). The corresponding absolute error (AE) is shown as an area in yellow (vertical scale on the right axis). The Mean Absolute Error (MAE), obtained as the average over all the oxidation stages, is displayed in each panel for each oil.
  • Figure 4: a, Comparison of predicted and measured (actual) values of the quality indicators $K_{232}$ and $K_{268}$ for all oils at all oxidation stages. The gray area in each plot marks the limit set by the Food and Agriculture Organisation of the United Nations and by the European Union. Oil C is marked in blue as its $K_{232}$ value was already above this limit at the beginning of the study and, therefore, is not well predicted by the model. b, Violin plots of the AE for each oil for $K_{232}$ (above) and $K_{268}$ (below). The dashed lines indicate the 3$\sigma$ statistically estimated experimental error.
  • Figure 5: Average of the heatmaps obtained for all oils in the last oxidation stage, showing the spectral bands relevant for the prediction of $K_{232}$ and $K_{268}$. R1 marks the absorption and emission bands of chlorophylls, R2 those of oxidation products.
  • ...and 2 more figures
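
The preprocessing and evaluation steps named in the captions of Figures 1 to 3 can be illustrated with a minimal sketch. It is not the authors' code. The function names, the 224 x 224 target size, the nearest-neighbour resizing, the per-matrix min-max normalisation, and the choice to hold out one oil at a time are all assumptions made for illustration; the captions only state that the data are split for LOO, normalised, and reshaped into three layers for MobileNetV2, and that AE and MAE are reported per oil.

```python
import numpy as np


def prepare_eem(eem, size=224):
    """Min-max normalise one EEM and expand it to a (size, size, 3) array.

    Assumptions: per-matrix min-max scaling and nearest-neighbour resizing;
    the paper's exact choices are not given in the captions.
    """
    eem = np.asarray(eem, dtype=np.float32)
    span = eem.max() - eem.min()
    # Scale intensities to [0, 1]; a constant matrix maps to all zeros.
    norm = (eem - eem.min()) / span if span > 0 else np.zeros_like(eem)
    # Nearest-neighbour resize: pick a source row/column for each target pixel.
    rows = np.arange(size) * eem.shape[0] // size
    cols = np.arange(size) * eem.shape[1] // size
    resized = norm[np.ix_(rows, cols)]
    # Replicate the single channel three times to mimic an RGB image.
    return np.repeat(resized[:, :, None], 3, axis=2)


def leave_one_oil_out(oil_ids):
    """Yield (held_out_oil, train_indices, test_indices) for each oil.

    Assumption: the held-out unit is one oil with all its oxidation stages.
    """
    oil_ids = np.asarray(oil_ids)
    for oil in np.unique(oil_ids):
        is_test = oil_ids == oil
        yield oil, np.where(~is_test)[0], np.where(is_test)[0]


def absolute_errors(y_true, y_pred):
    """Return the per-sample absolute error (AE) and its mean (MAE)."""
    ae = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return ae, float(ae.mean())
```

In use, each measured EEM would pass through `prepare_eem` before being fed to a pretrained network, `leave_one_oil_out` would drive the training loop, and `absolute_errors` would be applied to the held-out oil's predicted and measured $K_{232}$ or $K_{268}$ values to obtain the per-oil MAE.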