Reevaluating Convolutional Neural Networks for Spectral Analysis: A Focus on Raman Spectroscopy
Deniz Soysal, Xabier García-Andrade, Laura E. Rodriguez, Pablo Sobron, Laura M. Barge, Renaud Detry
TL;DR
This work reframes CNN-based Raman spectroscopy classification for autonomous, resource-constrained missions by training on raw spectra and carefully controlling translational invariance. It demonstrates four practical advances: (i) baseline-free accuracy, where 1-D CNNs outperform KNN/SVM on raw data without background correction; (ii) pooling-based robustness, where tuning the pooling parameter accommodates Raman shifts of up to $30\,\mathrm{cm}^{-1}$; (iii) label-efficient learning via SGAN and contrastive pretraining, which raises accuracy by up to $11\%$ with only $10\%$ of the labels; and (iv) constant-time adaptation by freezing the backbone and retraining only the softmax layer for new minerals, outperforming Siamese approaches on embedded hardware. The study also analyzes CNN inductive biases for spectra in depth, offers interpretable Grad-CAM insights, and discusses transfer-learning limitations, supported by reproducible data splits from the RRUFF database. Collectively, these findings offer a scalable, data-efficient pathway for robust Raman classification in autonomous planetary and oceanic exploration. The open release of datasets and scripts further enables benchmarking and deployment-ready evaluation.
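The link between pooling and tolerance to peak shifts in point (ii) can be illustrated with a toy sketch. It is not the authors' network: it applies plain non-overlapping max pooling to a synthetic single-peak "spectrum", and the helper names (`max_pool_1d`, `spike`), the 16-bin length, and the 2-bin shift are all assumptions chosen for illustration. In a real CNN the pooling would follow learned convolutions, and a shift in bins would map to $\mathrm{cm}^{-1}$ through the instrument's sampling interval.

```python
def max_pool_1d(signal, width):
    """Non-overlapping max pooling (stride == width); a trailing partial window is dropped."""
    return [max(signal[i:i + width]) for i in range(0, len(signal) - width + 1, width)]


def spike(length, position):
    """Toy 'spectrum' (hypothetical): zeros with one unit-height peak at `position`."""
    return [1.0 if i == position else 0.0 for i in range(length)]


reference = spike(16, 4)  # peak at bin 4
shifted = spike(16, 6)    # same peak displaced by 2 bins

for width in (1, 2, 4):
    same = max_pool_1d(reference, width) == max_pool_1d(shifted, width)
    print(f"pool width {width}: pooled outputs identical = {same}")
```

In this toy case only the width-4 pooling maps both peaks to the same output. The invariance is approximate: a peak that crosses a pooling-window boundary (for example, moving from bin 7 to bin 8 with width 4) still changes the pooled output, and wider windows trade away spectral resolution.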
Abstract
Autonomous Raman instruments on Mars rovers, deep-sea landers, and field robots must interpret raw spectra distorted by fluorescence baselines, peak shifts, and limited ground-truth labels. Using curated subsets of the RRUFF database, we evaluate one-dimensional convolutional neural networks (CNNs) and report four advances: (i) Baseline-independent classification: compact CNNs surpass $k$-nearest-neighbors and support-vector machines on handcrafted features, removing background-correction and peak-picking stages while ensuring reproducibility through released data splits and scripts. (ii) Pooling-controlled robustness: tuning a single pooling parameter accommodates Raman shifts up to $30 \,\mathrm{cm}^{-1}$, balancing translational invariance with spectral resolution. (iii) Label-efficient learning: semi-supervised generative adversarial networks and contrastive pretraining raise accuracy by up to $11\%$ with only $10\%$ labels, valuable for autonomous deployments with scarce annotation. (iv) Constant-time adaptation: freezing the CNN backbone and retraining only the softmax layer transfers models to unseen minerals at $\mathcal{O}(1)$ cost, outperforming Siamese networks on resource-limited processors. This workflow, which involves training on raw spectra, tuning pooling, adding semi-supervision when labels are scarce, and fine-tuning lightly for new targets, provides a practical path toward robust, low-footprint Raman classification in autonomous exploration.
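The adaptation step in point (iv), freezing the backbone and retraining only the softmax layer, can be sketched as follows. This is a minimal illustration under stated assumptions, not the released implementation: the 2-D vectors stand in for embeddings produced by a frozen CNN backbone, the three cluster centers, noise level, learning rate, and epoch count are invented for the example, and `train_softmax_head` and `predict` are hypothetical helper names.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_softmax_head(features, labels, n_classes, lr=0.5, epochs=200):
    """Fit only a linear softmax layer on fixed (frozen) feature vectors with SGD."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            logits = [sum(w * xi for w, xi in zip(weights[c], x)) + biases[c]
                      for c in range(n_classes)]
            probs = softmax(logits)
            for c in range(n_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)  # d(cross-entropy)/d(logit_c)
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
                biases[c] -= lr * grad
    return weights, biases


def predict(weights, biases, x):
    """Return the index of the class with the largest logit."""
    logits = [sum(w * xi for w, xi in zip(wc, x)) + b for wc, b in zip(weights, biases)]
    return max(range(len(logits)), key=logits.__getitem__)


# Hypothetical frozen-backbone embeddings: three well-separated mineral classes.
rng = random.Random(0)
centers = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
features, labels = [], []
for label, (cx, cy) in enumerate(centers):
    for _ in range(10):
        features.append([cx + rng.gauss(0, 0.1), cy + rng.gauss(0, 0.1)])
        labels.append(label)

weights, biases = train_softmax_head(features, labels, n_classes=3)
accuracy = sum(predict(weights, biases, x) == y for x, y in zip(features, labels)) / len(labels)
print(f"training accuracy of the retrained head: {accuracy:.2f}")
```

Only `weights` and `biases` are updated here; the embeddings are never modified, which mirrors the idea that the backbone stays frozen while the classification layer is refit for new target minerals.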
