Deep learning recognition and analysis of Volatile Organic Compounds based on experimental and synthetic infrared absorption spectra

Andrea Della Valle, Annalisa D'Arco, Tiziana Mancini, Rosanna Mosetti, Maria Chiara Paolozzi, Stefano Lupi, Sebastiano Pilati, Andrea Perali

TL;DR

This work tackles real-time VOC detection by combining experimental Fourier-transform infrared (FTIR) absorption spectra with synthetic spectra generated by a conditional variational autoencoder (CVAE). A discriminative model based on a convolutional neural network (CNN) identifies the VOC class and predicts its concentration, and its performance is boosted by data augmentation using both oversampling and CVAE-generated spectra. The CVAE generates spectra conditioned on VOC class and concentration, and the augmented models show marked improvements in both concentration prediction and class identification. These results are examined with saliency analysis and clustering visualizations. The authors provide a publicly available repository of synthetic spectra to facilitate further development of VOC sensing devices.

Abstract

Volatile Organic Compounds (VOCs) are organic molecules that have low boiling points and therefore easily evaporate into the air. They pose significant risks to human health, which makes their accurate detection central to efforts to monitor and minimize exposure. Infrared (IR) spectroscopy enables ultrasensitive detection of VOCs at low concentrations in the atmosphere by measuring their IR absorption spectra. However, the complexity of the IR spectra makes it difficult to implement VOC recognition and quantification in real time. While deep neural networks (NNs) are increasingly used for the recognition of complex data structures, they typically require massive datasets for the training phase. Here, we create an experimental VOC dataset for nine different classes of compounds at various concentrations, using their IR absorption spectra. To further increase the number of spectra and their diversity in terms of VOC concentration, we augment the experimental dataset with synthetic spectra created via conditional generative NNs. This allows us to train robust discriminative NNs that reliably identify the nine VOCs and precisely predict their concentrations. The trained NN is suitable for incorporation into sensing devices for VOC recognition and analysis.
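
The oversampling part of the augmentation can be illustrated with a minimal sketch. It assumes plain random oversampling, in which spectra from under-represented (class, concentration) conditions are duplicated until every condition has as many spectra as the largest one. The exact oversampling procedure used by the authors is not described here, and the function name `oversample`, the tuple layout, and the labels `voc_a` and `voc_b` are hypothetical.

```python
import random
from collections import defaultdict


def oversample(samples, seed=0):
    """Balance a dataset by duplicating spectra from rare conditions.

    `samples` is a list of (spectrum, voc_class, concentration) tuples.
    This tuple layout is an assumption made for illustration only.
    """
    rng = random.Random(seed)

    # Group the spectra by their (class, concentration) condition.
    groups = defaultdict(list)
    for sample in samples:
        _, voc_class, concentration = sample
        groups[(voc_class, concentration)].append(sample)

    # Every condition is filled up to the size of the largest one.
    target = max(len(group) for group in groups.values())

    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Draw the missing spectra with replacement from the same condition.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Toy data: three spectra for one condition, a single spectrum for another.
toy = [([0.1, 0.2], "voc_a", 10.0)] * 3 + [([0.3, 0.4], "voc_b", 20.0)]
balanced = oversample(toy)
```

Duplication of this kind only rebalances the dataset. It adds no new spectral shapes or concentrations, which is the gap the synthetic spectra from the conditional generative NN are meant to fill.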

Paper Structure

This paper contains 15 sections, 3 equations, 11 figures, 3 tables.

Figures (11)

  • Figure 1: Schematic view of the setup used for collecting VOC absorption spectra. Liquid VOCs are added in varying amounts to the evaporation chamber using a micropipette. Gaseous VOC concentrations are continuously monitored via a PID sensor controlled by LabVIEW$^{TM}$. Once the PID signal stabilizes, indicating vapor-liquid equilibrium, the valve to the pre-evacuated multipass spectrometer cell is opened. The cell, sealed with KBr windows and evacuated to a few mbar using an Edwards T-Station 85, allows rapid gas transfer. Spectral acquisition begins immediately upon valve opening, with PID readings recorded simultaneously.
  • Figure 2: Experimental dataset composition, divided by class and concentration (the number of IR spectra for each condition is given in brackets). The dataset is composed of $1253$ IR spectra.
  • Figure 3: Architecture of the CNN used to predict the VOC class and concentration. The sequence of one-dimensional convolutional layers is followed by two separate MLPs: the classifier head and the regression head.
  • Figure 4: Architecture of the CVAE. It is composed of two main blocks: a conditional encoder and a conditional decoder. After training, the conditional decoder is used to generate spectra with the desired conditions.
  • Figure 5: Training procedure for enhanced discriminative NNs.
  • ...and 6 more figures
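
As a rough illustration of what conditioning on VOC class and concentration (Figure 4) can look like in practice, the sketch below builds a condition vector from a one-hot class encoding and a normalized concentration. This is a common way to condition a CVAE, not necessarily the encoding used by the authors. The function name `condition_vector` and the parameter `max_concentration` are hypothetical, and the concentration is taken in whatever unit the dataset uses.

```python
NUM_CLASSES = 9  # nine VOC classes, as stated in the abstract


def condition_vector(class_index, concentration, max_concentration):
    """Return a one-hot class encoding followed by a scaled concentration.

    Assumed encoding for illustration: the first NUM_CLASSES entries are
    the one-hot class label, and the last entry is the concentration
    divided by `max_concentration` so that it lies in [0, 1].
    """
    if not 0 <= class_index < NUM_CLASSES:
        raise ValueError("class_index must be in the range 0..8")
    if max_concentration <= 0:
        raise ValueError("max_concentration must be positive")

    one_hot = [0.0] * NUM_CLASSES
    one_hot[class_index] = 1.0
    return one_hot + [concentration / max_concentration]


# Condition for class 2 at half of the maximum concentration.
cond = condition_vector(2, 50.0, 100.0)
```

In a typical CVAE, a vector like this is concatenated to the encoder input and to the latent sample fed to the decoder. After training, a spectrum for a chosen condition is obtained by sampling a latent vector and decoding it together with that condition.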