SpectraLLM: Uncovering the Ability of LLMs for Molecule Structure Elucidation from Multi-Spectral

Yunyue Su; Jiahui Chen; Zao Jiang; Zhenyi Zhong; Liang Wang; Qiang Liu; Zhaoxiang Zhang

Abstract

Automated molecular structure elucidation remains challenging, as existing approaches often depend on pre-compiled databases or restrict themselves to single spectroscopic modalities. Here we introduce SpectraLLM, a large language model that performs end-to-end structure prediction by reasoning over one or multiple spectra. Unlike conventional spectrum-to-structure pipelines, SpectraLLM represents both continuous (IR, Raman, UV-Vis, NMR) and discrete (MS) modalities in a shared language space, enabling it to capture substructural patterns that are complementary across different spectral types. We pretrain and fine-tune the model on small-molecule domains and evaluate it on four public benchmark datasets. SpectraLLM achieves state-of-the-art performance, substantially surpassing single-modality baselines. Moreover, it demonstrates strong robustness in unimodal settings and further improves prediction accuracy when jointly reasoning over diverse spectra, establishing a scalable paradigm for language-based spectroscopic analysis. Code is available at https://github.com/OPilgrim/SpectraLLM.
