A Graph Based Raman Spectral Processing Technique for Exosome Classification
Vuong M. Ngo, Edward Bolger, Stan Goodwin, John O'Sullivan, Dinh Viet Cuong, Mark Roantree
TL;DR
This work addresses exosome classification from Raman spectra using a graph-based spectral processing framework. It builds a Neo4j graph model in which spectral peaks are nodes, and it combines centrality-informed filtering (the PageRank Filter, PRF) with Optimal Spectral Cleaning (OSC) and dimensionality reduction (DR) to improve feature selection. Paired with Extra Trees, the OSC, OSC+PRF, and OSC+PRF+DR pipeline configurations reach a best accuracy of 0.857 on surfaces and 0.760 on spectra under group 10-fold cross-validation, with surface data outperforming raw spectra. Overall, the graph-based spectral filtering approach reduces noise while preserving biomarker signals, enhancing Raman-based exosome analysis for disease diagnostics and biomarker discovery.
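To make the idea of centrality-informed peak filtering concrete, the following is a minimal pure-Python sketch, not the authors' Neo4j implementation. It assumes that two peaks are linked whenever they occur in the same spectrum, with the edge weight equal to the number of co-occurrences; this summary does not state how edges are actually defined. The function names, the damping factor of 0.85, the `keep` parameter, and the toy peak positions are all hypothetical.

```python
from itertools import combinations


def build_peak_graph(spectra_peaks):
    """Build an undirected weighted graph with one node per peak position.

    Assumption: an edge links two peaks each time they co-occur in a spectrum.
    """
    graph = {}
    for peaks in spectra_peaks:
        unique = sorted(set(peaks))
        for p in unique:
            graph.setdefault(p, {})
        for a, b in combinations(unique, 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration; scores sum to 1."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            total = sum(graph[v].values())
            if total == 0:
                # Isolated peak: spread its score uniformly over all nodes.
                for u in nodes:
                    new_rank[u] += damping * rank[v] / n
            else:
                for u, w in graph[v].items():
                    new_rank[u] += damping * rank[v] * w / total
        rank = new_rank
    return rank


def pagerank_filter(spectra_peaks, keep):
    """Return the `keep` highest-scoring peak positions, in ascending order."""
    rank = pagerank(build_peak_graph(spectra_peaks))
    top = sorted(rank, key=rank.get, reverse=True)[:keep]
    return sorted(top)


# Toy peak positions (cm^-1), invented purely for illustration.
toy_spectra = [
    [1003, 1450, 1655],
    [1003, 1450],
    [1003, 1655, 2930],
    [1003, 1450, 1655],
    [780],
]
selected = pagerank_filter(toy_spectra, keep=3)
print(selected)
```

In this toy input, the peaks that co-occur most often (1003, 1450, 1655) are retained, while the rarely connected peak at 2930 and the isolated peak at 780 are dropped. A real pipeline would choose the cutoff by validation rather than fixing `keep` by hand.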
Abstract
Exosomes are small vesicles that are crucial for cell signaling and serve as disease biomarkers. Due to their complexity, an "omics" approach is preferable to relying on individual biomarkers. While Raman spectroscopy is effective for exosome analysis, it requires high sample concentrations and has limited sensitivity to lipids and proteins. Surface-enhanced Raman spectroscopy helps overcome these challenges. In this study, we leverage Neo4j graph databases to organize 3,045 Raman spectra of exosomes, enhancing data generalization. To further refine spectral analysis, we introduce a novel spectral filtering process that integrates the PageRank Filter with optimal Dimensionality Reduction. This method improves feature selection, resulting in superior classification performance. Specifically, the Extra Trees model, using our spectral processing approach, achieves accuracies of 0.76 and 0.857 in classifying hyperglycemic, hypoglycemic, and normal exosome samples based on Raman spectra and surfaces, respectively, under group 10-fold cross-validation. Our results show that graph-based spectral filtering combined with optimal dimensionality reduction significantly improves classification accuracy by reducing noise while preserving key biomarker signals. This novel framework enhances Raman-based exosome analysis, expanding its potential for biomedical applications, disease diagnostics, and biomarker discovery.
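The reported accuracies rely on group 10-fold cross-validation, in which all spectra sharing a group label are kept on the same side of every split, so related measurements cannot leak from training into testing. The sketch below illustrates that splitting rule in plain Python. It assumes the grouping variable is a sample identifier, which this abstract does not specify; the `group_k_fold` helper and the toy layout of 20 samples with 3 spectra each are hypothetical. In practice, scikit-learn's `GroupKFold` provides the same kind of split.

```python
from collections import defaultdict


def group_k_fold(groups, n_splits):
    """Yield (train_indices, test_indices) with no group on both sides."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    if len(by_group) < n_splits:
        raise ValueError("need at least as many groups as folds")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest spectra.
    folds = [[] for _ in range(n_splits)]
    for g in sorted(by_group, key=lambda name: len(by_group[name]), reverse=True):
        smallest = min(folds, key=len)
        smallest.extend(by_group[g])

    for k in range(n_splits):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


# Hypothetical layout: 20 samples, 3 spectra per sample.
groups = [f"sample_{i // 3}" for i in range(60)]
splits = list(group_k_fold(groups, n_splits=10))

for train, test in splits:
    train_groups = {groups[i] for i in train}
    test_groups = {groups[i] for i in test}
    # No sample contributes spectra to both training and test sets.
    assert train_groups.isdisjoint(test_groups)
```

Each spectrum appears in exactly one test fold, and a classifier such as Extra Trees would be fitted on `train` and scored on `test` for each of the ten splits before averaging the accuracies.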
