NEAT: Neighborhood-Guided, Efficient, Autoregressive Set Transformer for 3D Molecular Generation

Daniel Rose, Roxane Axel Jacob, Johannes Kirchmair, Thierry Langer

TL;DR

This paper introduces NEAT, a permutation-invariant autoregressive generator for 3D molecules that treats molecules as atom sets and uses neighborhood-guided supervision with a set-transformer trunk and flow-based coordinate modeling. By avoiding canonical atom orderings and employing neighborhood continuations, NEAT achieves order-agnostic next-token predictions with efficient, batched inference. On QM9, NEAT shows competitive performance across stability, validity, and uniqueness metrics while delivering faster sampling than diffusion-based methods. This approach provides a scalable foundation for conditional de novo molecular design and autoregressive 3D generation without vector quantization.

Abstract

Autoregressive models are a promising alternative to diffusion-based models for 3D molecular structure generation. However, a key limitation is the assumption of a token order: while text has a natural sequential order, the next token prediction given a molecular graph prefix should be invariant to atom permutations. Previous works sidestepped this mismatch by using canonical orders or focus atoms. We argue that this is unnecessary. We introduce NEAT, a Neighborhood-guided, Efficient, Autoregressive, Set Transformer that treats molecular graphs as sets of atoms and learns the order-agnostic distribution over admissible tokens at the graph boundary with an autoregressive flow model. NEAT approaches state-of-the-art performance in 3D molecular generation with high computational efficiency and atom-level permutation invariance, establishing a practical foundation for scalable molecular design.

Paper Structure

This paper contains 34 sections, 7 equations, 6 figures, 3 tables, 1 algorithm.

Figures (6)

  • Figure 1: Overview of the proposed training workflow. The model takes molecular graphs as input, where nodes encode atom types and positions, and edges represent chemical bonds. For each training example, we randomly sample a connected subgraph. The selected nodes form the source set, and their neighboring boundary nodes define the target set. The source set is encoded using a set transformer, and its representation is used to predict the distribution over the next atom type. For each target-set node, we sample a position from a normal distribution and construct linear interpolation paths between the sampled and true positions. These interpolation paths provide supervision for learning the vector field via flow matching.
  • Figure 2: Overview of the proposed model inference workflow.
  • Figure 3: Randomly selected examples of generated molecules (white: hydrogen, gray: carbon, purple: nitrogen, red: oxygen).
  • Figure 4: Relative and absolute size distribution of the source and target set w.r.t. the original graphs. Sampling was performed with $\beta=1.5$ and $\gamma=0.55$.
  • Figure 5: Randomly selected molecules generated by NEAT after being trained on the QM9 dataset (white: hydrogen, gray: carbon, blue: nitrogen, red: oxygen, green: fluorine). 2D plots of the same molecules are shown in the next figure.
  • ...and 1 more figures
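
The training workflow described in the Figure 1 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `sample_source_and_target` and `flow_matching_pair`, the uniform subgraph-growth rule (the paper's sampler uses parameters $\beta$ and $\gamma$ that are not specified here), the standard-normal prior, and the toy chain molecule are all assumptions made for illustration. The sketch shows how a connected source set and its boundary target set could be drawn, and how a linear interpolation path yields a flow-matching regression target.

```python
import random
import numpy as np

def sample_source_and_target(adjacency, rng, max_size):
    """Grow a random connected subgraph (source) and return its boundary (target).

    Assumption: nodes are added uniformly from the current frontier; the
    paper's actual size distribution is not reproduced here.
    """
    start = rng.choice(sorted(adjacency))
    source = {start}
    size = rng.randint(1, max_size)  # inclusive on both ends
    while len(source) < size:
        frontier = sorted({n for s in source for n in adjacency[s]} - source)
        if not frontier:
            break
        source.add(rng.choice(frontier))
    # Boundary nodes: bonded to the source set but not part of it.
    target = {n for s in source for n in adjacency[s]} - source
    return source, target

def flow_matching_pair(x1, t, np_rng):
    """Linear path x_t = (1 - t) * x0 + t * x1 with x0 ~ N(0, I) (assumed prior).

    The regression target for the vector field along this path is x1 - x0.
    """
    x0 = np_rng.standard_normal(x1.shape)
    xt = (1.0 - t) * x0 + t * x1
    velocity = x1 - x0
    return x0, xt, velocity

# Hypothetical toy molecule: a 5-atom chain 0-1-2-3-4 with made-up coordinates.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
positions = np.arange(15, dtype=float).reshape(5, 3)

rng = random.Random(0)
np_rng = np.random.default_rng(0)
source, target = sample_source_and_target(adjacency, rng, max_size=3)

x1 = positions[sorted(target)]  # true positions of the target-set atoms
t = 0.3                         # interpolation time in [0, 1]
x0, xt, velocity = flow_matching_pair(x1, t, np_rng)
```

In a full training loop, the source set would be encoded by the set transformer and a network would be trained to predict `velocity` from `xt`, `t`, and the source representation; those parts are omitted here.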