RamPINN: Recovering Raman Spectra From Coherent Anti-Stokes Spectra Using Embedded Physics
Sai Karthikeya Vemuri, Adithya Ashok Chalain Valapil, Tim Büchner, Joachim Denzler
TL;DR
RamPINN tackles the ill-posed problem of recovering Raman spectra from CARS measurements by using physics-informed losses to disentangle the resonant Raman signal from the non-resonant background (NRB). It uses a dual-decoder 1D U-Net architecture and enforces Kramers-Kronig causality via a differentiable Hilbert-transform loss, together with a smoothness prior on the NRB. Trained solely on synthetic data, RamPINN achieves strong zero-shot generalization to six real molecules and outperforms purely data-driven baselines, and a self-supervised variant remains competitive. The work demonstrates that embedding established physical laws into neural networks provides a principled, robust inductive bias for data-limited scientific inverse problems, with broad implications for spectroscopic reconstruction and beyond.
Abstract
Transferring recent advances in deep learning to scientific disciplines is hindered by the lack of the large-scale datasets required for training. We argue that in these knowledge-rich domains, the established body of scientific theory provides reliable inductive biases in the form of governing physical laws. We address the ill-posed inverse problem of recovering Raman spectra from noisy Coherent Anti-Stokes Raman Scattering (CARS) measurements, in which the true Raman signal is suppressed by a dominating non-resonant background. We propose RamPINN, a model that learns to recover Raman spectra from given CARS spectra. Our core methodological contribution is a physics-informed neural network that uses a dual-decoder architecture to disentangle resonant and non-resonant signals. This is done by enforcing the Kramers-Kronig causality relations via a differentiable Hilbert transform loss on the resonant part of the signal and a smoothness prior on the non-resonant part. Trained entirely on synthetic data, RamPINN demonstrates strong zero-shot generalization to real-world experimental data, closing the gap between synthetic training data and real measurements and significantly outperforming existing baselines. Furthermore, we show that training with these physics-based losses alone, without access to any ground-truth Raman spectra, still yields competitive results. This work highlights a broader concept: formal scientific rules can act as a potent inductive bias, enabling robust, self-supervised learning in data-limited scientific domains.
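As a rough illustration of the two physics-based terms described in the abstract, the following NumPy sketch computes a Hilbert-transform consistency loss and a smoothness penalty. The function names, the sign convention relating the real and imaginary parts, the mean-squared form of both losses, and the use of second differences for smoothness are all assumptions made for illustration and are not taken from the paper. A training setup would presumably use the equivalent FFT operations of an autodiff framework so that gradients can flow through the transform.

```python
import numpy as np

def hilbert_transform(x):
    """Discrete Hilbert transform along the last axis, computed via the FFT.

    Each Fourier component is multiplied by -i * sign(frequency), which maps
    cos -> sin. This uses only FFTs and pointwise products, so the same
    recipe is differentiable in frameworks that provide FFT gradients.
    """
    n = x.shape[-1]
    freqs = np.fft.fftfreq(n)              # signed frequency of each FFT bin
    multiplier = -1j * np.sign(freqs)      # 0 at DC, -i for +freq, +i for -freq
    spectrum = np.fft.fft(x, axis=-1)
    return np.fft.ifft(spectrum * multiplier, axis=-1).real

def kk_loss(real_part, imag_part):
    """Hypothetical Kramers-Kronig consistency loss (assumed form).

    Penalizes the mismatch between the predicted real part of the resonant
    signal and the Hilbert transform of its imaginary part. The sign
    convention (real = +H[imag]) is an assumption; some conventions use -H.
    """
    return np.mean((real_part - hilbert_transform(imag_part)) ** 2)

def smoothness_loss(nrb):
    """Hypothetical smoothness prior on the non-resonant background.

    Uses the mean squared second finite difference, so constant and linear
    backgrounds incur no penalty while rapidly varying ones do.
    """
    second_diff = np.diff(nrb, n=2, axis=-1)
    return np.mean(second_diff ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    imag = rng.standard_normal(128)        # stand-in for a predicted Raman-like signal
    real = hilbert_transform(imag)         # a KK-consistent partner by construction
    print("KK loss (consistent pair):", kk_loss(real, imag))
    print("KK loss (inconsistent pair):", kk_loss(-real, imag))
    print("Smoothness (linear NRB):", smoothness_loss(np.linspace(0.0, 1.0, 128)))
    print("Smoothness (noisy NRB):", smoothness_loss(rng.standard_normal(128)))
```

In this sketch a pair constructed to satisfy the assumed relation gives a loss near zero, while flipping the sign of the real part does not. Likewise, a linear background is unpenalized, whereas a noisy one is.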
