
MolecularGPT: Open Large Language Model (LLM) for Few-Shot Molecular Property Prediction

Yuyan Liu, Sirui Ding, Sheng Zhou, Wenqi Fan, Qiaoyu Tan

TL;DR

MolecularGPT introduces an instruction-tuned LLM framework for few-shot molecular property prediction by using SMILES-based prompts, structure-aware neighbor demonstrations, and a hybrid zero-/few-shot instruction set. It demonstrates strong zero-shot and few-shot performance across MoleculeNet and CYP450 benchmarks, surpassing several LM-based and GNN baselines, and provides extensive analyses of prompt design, retrieval strategy, and instruction scaling. The approach highlights the potential of large language models to generalize to unseen molecular tasks with minimal fine-tuning, though it remains SMILES-centric and focused on property prediction rather than generation or 3D-aware reasoning. The work offers practical insights for deploying LLM-based MPP systems and points to future work in integrating additional molecular modalities and tasks.

Abstract

Molecular property prediction (MPP) is a fundamental and crucial task in drug discovery. However, prior methods are limited by their need for a large number of labeled molecules and by their restricted ability to generalize to new, unseen tasks, even though learning from few labels and generalizing across tasks are both essential for real-world applications. To address these challenges, we present MolecularGPT for few-shot MPP. From the perspective of instruction tuning, we fine-tune large language models (LLMs) on curated molecular instructions spanning over 1000 property prediction tasks. This yields a versatile and specialized LLM that can be adapted to novel MPP tasks without any fine-tuning, through zero- and few-shot in-context learning (ICL). MolecularGPT exhibits competitive in-context reasoning capabilities across 10 downstream evaluation datasets, setting new benchmarks for few-shot molecular prediction tasks. More importantly, with just two-shot examples, MolecularGPT can outperform standard supervised graph neural network methods on 4 out of 7 datasets. In the zero-shot setting, it also outperforms state-of-the-art LLM baselines, with up to a 15.7% increase in classification accuracy and a decrease of up to 17.9 in regression metrics (e.g., RMSE). This study demonstrates the potential of LLMs as effective few-shot molecular property predictors. The code is available at https://github.com/NYUSHCS/MolecularGPT.

Paper Structure (23 sections, 3 equations, 4 figures, 13 tables)

This paper contains 23 sections, 3 equations, 4 figures, 13 tables.

Figures (4)

  • Figure 1: The proposed MolecularGPT framework. To instructionally fine-tune LLMs for MPP tasks, we construct a hybrid instruction set that includes both zero-shot and few-shot instructions across more than 1000 property tasks. Each few-shot instruction adaptively selects the query molecule's top-$k$ neighboring molecules as labeled demonstrations for prompt design.
  • Figure 2: Performance on the CYP450 test dataset.
  • Figure 3: The performance of MolecularGPT on Classification (Cls) and Regression (Reg) tasks when tuned with different types of instruction datasets. Inference uses 0-shot and 2-shot examples. (0&4-shot indicates a hybrid of $0$-shot and $4$-shot instructions; 0-4-shot indicates a mix of 0-, 1-, 2-, 3-, and 4-shot instructions; tuning_double indicates doubling the instruction set size.)
  • Figure 4: The performance of MolecularGPT on Classification (Cls) and Regression (Reg) tasks with different in-context inference strategies. To show our model's capability, we also report the performance of the fine-tuned model, GAT.
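
The Figure 1 caption describes few-shot instructions that use the query molecule's top-$k$ neighboring molecules as labeled demonstrations. The sketch below illustrates that idea only; it is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the prompt wording, and especially the similarity measure, which here is a Tanimoto score over character bigrams of the SMILES string standing in for a real molecular fingerprint. Setting `k = 0` gives a zero-shot prompt.

```python
from typing import List, Set, Tuple

Example = Tuple[str, str]  # (SMILES, label); hypothetical data layout


def ngrams(smiles: str, n: int = 2) -> Set[str]:
    """Character n-grams of a SMILES string.

    Assumption: a crude stand-in for a molecular fingerprint.
    """
    if len(smiles) < n:
        return {smiles}
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def top_k_neighbors(query: str, pool: List[Example], k: int) -> List[Example]:
    """Return the k labeled molecules most similar to the query."""
    q = ngrams(query)
    ranked = sorted(pool, key=lambda ex: tanimoto(q, ngrams(ex[0])), reverse=True)
    return ranked[:k]


def build_prompt(task: str, query: str, pool: List[Example], k: int) -> str:
    """Assemble a k-shot prompt; the template wording is hypothetical."""
    lines = [f"Instruction: {task}"]
    # Most similar demonstration first (an arbitrary ordering choice).
    for smiles, label in top_k_neighbors(query, pool, k):
        lines.append(f"SMILES: {smiles}\nAnswer: {label}")
    lines.append(f"SMILES: {query}\nAnswer:")
    return "\n".join(lines)


if __name__ == "__main__":
    pool = [("CCO", "No"), ("c1ccccc1", "Yes"), ("CCN", "No")]
    print(build_prompt("Is this molecule toxic? Answer Yes or No.", "CCCO", pool, k=2))
```

In practice the bigram similarity would be replaced by a chemistry-aware fingerprint, and the labeled pool would come from the training split of the target task.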