Graph Diffusion Transformers are In-Context Molecular Designers

Gang Liu, Jie Chen, Yihan Zhu, Michael Sun, Tengfei Luo, Nitesh V Chawla, Meng Jiang

TL;DR

DemoDiff tackles in-context molecular design by conditioning diffusion-based molecule generation on demonstrations of molecule–score pairs, which lets the model infer the latent task concept. It introduces Node Pair Encoding, a motif-level tokenizer that reduces graph size by about $5.5\times$ while preserving reconstructability. A $0.7$B-parameter DemoDiff model is pretrained on over $1.6$ million tasks drawn from ChEMBL and polymer datasets and evaluated on 33 design tasks across six categories, where it matches or surpasses language models $10^2$–$10^3\times$ larger and outperforms domain-specific baselines. Additionally, a consistency score filters generations to improve reliability, positioning demonstration-conditioned diffusion as a scalable molecular foundation approach.

Abstract

In-context learning allows large models to adapt to new tasks from a few demonstrations, but it has shown limited success in molecular design. Existing databases such as ChEMBL contain molecular properties spanning millions of biological assays, yet labeled data for each property remain scarce. To address this limitation, we introduce demonstration-conditioned diffusion models (DemoDiff), which define task contexts using a small set of molecule-score examples instead of text descriptions. These demonstrations guide a denoising Transformer to generate molecules aligned with target properties. For scalable pretraining, we develop a new molecular tokenizer with Node Pair Encoding that represents molecules at the motif level, requiring 5.5$\times$ fewer nodes. We curate a dataset containing millions of context tasks from multiple sources covering both drugs and materials, and pretrain a 0.7-billion-parameter model on it. Across 33 design tasks in six categories, DemoDiff matches or surpasses language models 100-1000$\times$ larger and achieves an average rank of 3.63 compared to 5.25-10.20 for domain-specific approaches. These results position DemoDiff as a molecular foundation model for in-context molecular design. Our code is available at https://github.com/liugangcode/DemoDiff.
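
To make the idea of a motif-level tokenizer concrete, here is a minimal sketch of one pair-merging step on a labeled graph. It assumes that Node Pair Encoding works loosely like byte pair encoding applied to graph edges; the name suggests this analogy, but the abstract does not specify the algorithm. The functions `most_frequent_pair` and `merge_pair` and the toy graph are hypothetical illustrations, not the authors' implementation, and the sketch ignores bond types and the bookkeeping needed for exact reconstruction.

```python
from collections import Counter


def most_frequent_pair(labels, edges):
    """Return the most common unordered label pair over all edges (assumed BPE-like criterion)."""
    counts = Counter(tuple(sorted((labels[u], labels[v]))) for u, v in edges)
    return counts.most_common(1)[0][0] if counts else None


def merge_pair(labels, edges, pair):
    """Contract non-overlapping edges whose endpoint labels match `pair` into motif nodes."""
    parent = {n: n for n in labels}   # maps each original node to its surviving node
    used = set()                      # nodes already consumed by a merge in this step
    new_labels = dict(labels)
    for u, v in edges:
        if u in used or v in used:
            continue
        if tuple(sorted((labels[u], labels[v]))) == pair:
            used.update((u, v))
            parent[v] = u
            new_labels[u] = "(" + pair[0] + "-" + pair[1] + ")"
            del new_labels[v]
    # Re-wire the remaining edges onto the surviving nodes, dropping contracted edges.
    new_edges = set()
    for u, v in edges:
        a, b = parent[u], parent[v]
        if a != b:
            new_edges.add((min(a, b), max(a, b)))
    return new_labels, sorted(new_edges)


# Toy graph (hypothetical): two C-O units joined through the carbons, plus one N.
labels = {0: "C", 1: "O", 2: "C", 3: "O", 4: "N"}
edges = [(0, 1), (0, 2), (2, 3), (2, 4)]

pair = most_frequent_pair(labels, edges)
motif_labels, motif_edges = merge_pair(labels, edges, pair)
print(pair, motif_labels, motif_edges)
```

In this toy case, one merge step shrinks the graph from five nodes to three. Repeating such merges over a corpus would build up a motif vocabulary, which is the kind of node-count reduction the abstract reports, though the actual procedure may differ.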

Paper Structure

This paper contains 29 sections, 10 equations, 16 figures, 17 tables, 1 algorithm.

Figures (16)

  • Figure 1: In-context molecular design with DemoDiff. Each demo is defined as a score–molecule pair, and a set of them forms the task context as conditions. After pretraining on large and diverse tasks, DemoDiff serves as a foundation model for designing molecules in new task contexts. Scores represent relative distances to the target and are converted from raw labels, as described in the pretraining subsection.
  • Figure 2: Demonstration-conditioned diffusion generation. In the reverse process, DemoDiff starts from random noise and denoises molecules conditioned on a set of molecule–score demonstration pairs at the motif level, with a tokenizer bridging motif and atom representations.
  • Figure 3: Pretraining data statistics for property rank-frequency and node count density.
  • Figure 4: Ablation studies on Albuterol drug rediscovery.
  • Figure 5: Ablation studies on context consistency scores: (1) left shows improvements; (2) right shows the relationship between consistency score and oracle scores.
  • ...and 11 more figures