Deep learning recognition and analysis of Volatile Organic Compounds based on experimental and synthetic infrared absorption spectra
Andrea Della Valle, Annalisa D'Arco, Tiziana Mancini, Rosanna Mosetti, Maria Chiara Paolozzi, Stefano Lupi, Sebastiano Pilati, Andrea Perali
TL;DR
This work tackles real-time VOC detection by combining experimental Fourier-transform infrared (FTIR) absorption spectra with synthetic spectra generated by a conditional variational autoencoder (CVAE). A discriminative model based on a convolutional neural network (CNN) identifies the VOC class and predicts its concentration; its performance is improved by augmenting the training data with both oversampled and CVAE-generated spectra. The CVAE generates spectra conditioned on VOC class and concentration, and the augmented models show marked improvements in concentration prediction and class-identification accuracy, supported by saliency analysis and clustering visualizations. The authors provide a publicly available repository of synthetic spectra to facilitate further development of VOC sensing devices.
Abstract
Volatile Organic Compounds (VOCs) are organic molecules that have low boiling points and therefore easily evaporate into the air. They pose significant risks to human health, making their accurate detection central to efforts to monitor and minimize exposure. Infrared (IR) spectroscopy enables ultrasensitive detection of VOCs at low concentrations in the atmosphere by measuring their IR absorption spectra. However, the complexity of the IR spectra makes it difficult to implement VOC recognition and quantification in real time. While deep neural networks (NNs) are increasingly used for the recognition of complex data structures, they typically require massive datasets for the training phase. Here, we create an experimental VOC dataset for nine different classes of compounds at various concentrations, using their IR absorption spectra. To further increase the number of spectra and their diversity in terms of VOC concentration, we augment the experimental dataset with synthetic spectra created via conditional generative NNs. This allows us to train robust discriminative NNs that reliably identify the nine VOCs and precisely predict their concentrations. The trained NN is suitable for incorporation into sensing devices for VOC recognition and analysis.
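
As an illustration of the two augmentation ingredients named above, the following minimal Python sketch shows one possible way to balance a dataset by oversampling and to encode the class and concentration labels that a conditional generative NN could be conditioned on. It is not the authors' implementation: the function names, the tuple layout of a sample, the one-hot encoding, the scaling by `max_concentration`, and the toy values are all assumptions made for illustration only.

```python
import random
from collections import Counter

NUM_CLASSES = 9  # nine VOC classes, as stated in the abstract


def condition_vector(class_index, concentration, max_concentration):
    """Hypothetical conditioning label: one-hot class + concentration in [0, 1]."""
    if not 0 <= class_index < NUM_CLASSES:
        raise ValueError("class_index out of range")
    one_hot = [0.0] * NUM_CLASSES
    one_hot[class_index] = 1.0
    # Scaling by an assumed maximum keeps the label on the same scale as the one-hot part.
    return one_hot + [concentration / max_concentration]


def oversample(samples, rng):
    """Resample with replacement until every class matches the largest class.

    `samples` is a list of (class_index, concentration, spectrum) tuples;
    this layout is an assumption, not the format of the published dataset.
    """
    by_class = {}
    for sample in samples:
        by_class.setdefault(sample[0], []).append(sample)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)  # keep every original sample
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Toy data: three spectra of class 0 and one of class 1 (made-up numbers).
toy = [
    (0, 10.0, [0.1, 0.2]),
    (0, 20.0, [0.2, 0.4]),
    (0, 30.0, [0.3, 0.6]),
    (1, 10.0, [0.5, 0.1]),
]
balanced = oversample(toy, random.Random(0))
counts = Counter(sample[0] for sample in balanced)
label = condition_vector(2, 50.0, 100.0)
```

In this sketch, oversampling only duplicates existing spectra, so it balances class frequencies without adding new concentrations; a conditional generator fed with labels like `label` is what would supply spectra at concentrations absent from the measured set.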
