Unmasking Airborne Threats: Guided-Transformers for Portable Aerosol Mass Spectrometry
Kyle M. Regan, Michael McLoughlin, Wayne A. Bryden, Gonzalo R. Arce
TL;DR
The paper tackles real-time detection of airborne pathogens with portable aerosol MALDI-MS, where single-shot spectra are noisy and conventional multi-shot averaging is impractical in the field. It introduces MS-DGFormer, a dual-stream transformer that uses SVD-denoised dictionary subspaces as priors: it processes raw spectra and denoised dictionaries in parallel and fuses them via selection attention, enabling robust single-shot, multi-label classification. The authors report state-of-the-art macro- and micro-averaged performance on a field-relevant aerosol dataset, and introduce MS-DGFormer-E, a streamlined inference variant that reduces the parameter count and doubles throughput. The work supports real-time environmental biosurveillance with portable MALDI-ToF platforms, potentially enabling rapid response to biological threats in public spaces.
Abstract
Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI-MS) is a cornerstone of biomolecular analysis, offering precise identification of pathogens through their unique mass spectral signatures. Yet its reliance on labor-intensive sample preparation and multi-shot spectral averaging restricts its use to laboratory settings, rendering it impractical for real-time environmental monitoring. These limitations are especially pronounced in emerging aerosol MALDI-MS systems, where autonomous sampling generates noisy spectra from unknown aerosol analytes and effective analysis requires single-shot detection. To address these challenges, we propose the Mass Spectral Dictionary-Guided Transformer (MS-DGFormer): a data-driven framework that directly processes raw, minimally prepared mass spectral data. MS-DGFormer uses a transformer architecture designed to capture the long-range dependencies inherent in these time-series spectra. To enhance feature extraction, we introduce a novel dictionary encoder that integrates denoised spectral information derived from Singular Value Decomposition (SVD), enabling the model to discern critical biomolecular patterns from single-shot spectra with robust performance. The resulting system achieves superior pathogen identification from aerosol samples, facilitating autonomous, real-time analysis in field conditions. By eliminating the need for extensive preprocessing, our method unlocks the potential for portable, deployable MALDI-MS platforms for environmental pathogen detection and rapid response to biological threats.
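The SVD-based denoising mentioned in the abstract can be illustrated with a truncated SVD of a stack of spectra. The following is a minimal sketch on synthetic data, not the authors' pipeline: the function name `svd_denoise`, the matrix shape (shots by m/z bins), the Gaussian peak model, the noise level, and the choice of rank 1 are all assumptions made for illustration.

```python
import numpy as np


def svd_denoise(spectra: np.ndarray, rank: int) -> np.ndarray:
    """Return the best rank-`rank` approximation (in Frobenius norm)
    of a (n_shots, n_bins) matrix of spectra via truncated SVD."""
    U, s, Vt = np.linalg.svd(spectra, full_matrices=False)
    # Keep only the leading `rank` singular triplets.
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


# Synthetic stand-in for single-shot spectra (hypothetical, not real MALDI data).
rng = np.random.default_rng(0)
n_shots, n_bins = 64, 512
bins = np.arange(n_bins)

# One shared peak pattern with two Gaussian peaks (assumed shape).
peak_shape = (
    np.exp(-0.5 * ((bins - 200) / 3.0) ** 2)
    + 0.6 * np.exp(-0.5 * ((bins - 350) / 4.0) ** 2)
)

# Each shot scales the pattern by a random intensity, then gets additive noise.
intensities = rng.uniform(0.5, 1.5, size=(n_shots, 1))
clean = intensities * peak_shape
noisy = clean + 0.2 * rng.standard_normal((n_shots, n_bins))

denoised = svd_denoise(noisy, rank=1)

err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(f"error before: {err_noisy:.2f}, after: {err_denoised:.2f}")
```

Because the synthetic signal here is exactly rank 1, keeping a single singular component discards most of the noise. Real dictionaries would need a rank chosen from the singular value spectrum or by validation, and the abstract does not state which rank the authors use.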
