Bayesian modelling and quantification of Raman spectroscopy

Matthew Moores, Kirsten Gracie, Jake Carson, Karen Faulds, Duncan Graham, Mark Girolami

TL;DR

The paper addresses robust quantification in Raman spectroscopy despite large nonuniform baselines by proposing a Bayesian multivariate calibration model that decomposes spectra into a smooth baseline and a set of Gaussian or Lorentzian peaks, with peak amplitudes linked to molecular concentrations. Inference is performed with a likelihood-tempered sequential Monte Carlo (SMC) algorithm that jointly estimates baseline and peak parameters, yielding posterior distributions for quantities of scientific interest such as amplitudes, FWHM, and the limit of detection (LOD), e.g., $c_{LOD} = \frac{3 \sigma_\epsilon}{\beta_p}$. The method is implemented in the open-source R package serrsBayes, enabling simultaneous, uncertainty-quantified calibration across multiple peaks. The approach improves LOD estimation and uncertainty quantification for multiplexed Raman spectra and has potential extensions to Raman mapping and high-throughput analysis.
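
As a rough illustration of how a posterior distribution for the LOD follows from the formula above, the sketch below pushes paired draws of the noise standard deviation $\sigma_\epsilon$ and the calibration slope $\beta_p$ through $c_{LOD} = 3\sigma_\epsilon / \beta_p$. The draws are synthetic placeholders, not output from serrsBayes, and the function name `lod_samples` and all numeric values are assumptions made for this example only.

```python
import random
import statistics


def lod_samples(sigma_samples, beta_samples):
    """Map paired draws of (sigma_eps, beta_p) to draws of c_LOD = 3 * sigma_eps / beta_p.

    Assumes every slope draw is strictly positive.
    """
    return [3.0 * s / b for s, b in zip(sigma_samples, beta_samples)]


# Hypothetical posterior draws (made-up values, for illustration only).
rng = random.Random(1)
sigma_draws = [rng.gauss(4.0, 0.3) for _ in range(5000)]   # noise s.d., intensity units
beta_draws = [rng.gauss(50.0, 2.0) for _ in range(5000)]   # slope, intensity per unit concentration

draws = lod_samples(sigma_draws, beta_draws)

# Summarise with the median and an equal-tailed 95% interval.
cuts = statistics.quantiles(draws, n=40)   # 39 cut points: 2.5%, 5%, ..., 97.5%
lo, med, hi = cuts[0], statistics.median(draws), cuts[-1]
print(f"LOD median {med:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

Because each draw of the LOD is computed from one joint draw of the noise level and the slope, the resulting interval reflects uncertainty in both quantities rather than a single plug-in estimate.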

Abstract

Raman spectroscopy can be used to identify molecules such as DNA by the characteristic scattering of light from a laser. It is sensitive at very low concentrations and can accurately quantify the amount of a given molecule in a sample. The presence of a large, nonuniform background presents a major challenge to analysis of these spectra. To overcome this challenge, we introduce a sequential Monte Carlo (SMC) algorithm to separate the observed spectrum into a series of peaks plus a smoothly-varying baseline, corrupted by additive white noise. The peaks are modelled using Lorentzian or Gaussian broadening functions, while the baseline is estimated using a penalised cubic spline. This latent continuous representation accounts for differences in resolution between measurements. By incorporating this representation in a Bayesian model, we can quantify the relationship between molecular concentration and peak intensity, thereby providing an improved estimate of the limit of detection (LOD), which is of major importance in analytical chemistry.
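
The decomposition described in the abstract (peaks plus a smooth baseline plus additive white noise) can be sketched as a small forward simulation. This is a minimal illustration under stated assumptions: the peak functions use the standard height-normalised Lorentzian and Gaussian forms, the baseline is a simple quadratic standing in for the penalised cubic spline, and every amplitude, width, and noise level is hypothetical. It does not reproduce the paper's inference algorithm.

```python
import math
import random


def lorentzian(nu, loc, scale, amp):
    """Lorentzian peak of height `amp` centred at `loc`; FWHM = 2 * scale."""
    return amp * scale**2 / ((nu - loc) ** 2 + scale**2)


def gaussian(nu, loc, scale, amp):
    """Gaussian peak of height `amp` centred at `loc`; FWHM = 2 * sqrt(2 ln 2) * scale."""
    return amp * math.exp(-((nu - loc) ** 2) / (2.0 * scale**2))


def baseline(nu):
    """Smooth quadratic stand-in (assumption), not a penalised cubic spline."""
    return 200.0 + 0.05 * (nu - 400.0) - 2e-5 * (nu - 400.0) ** 2


def simulate(wavenumbers, peaks, noise_sd, rng, shape=lorentzian):
    """Return baseline + sum of peaks + white noise at each wavenumber."""
    spectrum = []
    for nu in wavenumbers:
        signal = sum(shape(nu, loc, scale, amp) for loc, scale, amp in peaks)
        spectrum.append(baseline(nu) + signal + rng.gauss(0.0, noise_sd))
    return spectrum


# Hypothetical peaks as (location in cm^-1, scale, amplitude).
peaks = [(880.0, 5.0, 100.0), (1055.0, 8.0, 40.0)]
wavenumbers = [400.0 + 2.0 * i for i in range(600)]   # 400 to 1598 cm^-1
spectrum = simulate(wavenumbers, peaks, noise_sd=3.0, rng=random.Random(0))
print(len(spectrum), "simulated intensities")
```

Because the peaks and baseline are continuous functions of wavenumber, the same parameters can be evaluated on any measurement grid, which is how a latent continuous representation accommodates spectra recorded at different resolutions.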

Paper Structure

This paper contains 5 sections, 10 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Raman spectrum of ethanol (EtOH), showing the locations of 6 major peaks (430, 880, 1055, 1090, 1280 & 1460 cm$^{-1}$).
  • Figure 2: Informative priors for the scale parameters of Raman peaks, derived from manual baseline correction and peak fitting of Cy3, TAMRA and FAM spectra using Grams/AI 7.00.
  • Figure 3: Surface-enhanced Raman scattering (SERS) spectra and model fit at very low concentrations of tetramethylrhodamine (TAMRA) dye.