Flexible MOF Generation with Torsion-Aware Flow Matching

Nayoung Kim, Seongsu Kim, Sungsoo Ahn

TL;DR

MOFFlow-2 introduces a two-stage framework that jointly enables novel MOF chemistries and accurate 3D structure generation without a fixed block library or rigid conformations. The first stage autoregressively generates building blocks in SMILES form, while the second stage uses a torsion-aware flow model to predict translations, rotations, torsions, and lattice parameters for full 3D assembly. By explicitly modeling torsion angles and employing canonicalization and MOF matching to mitigate symmetry and distributional shifts, MOFFlow-2 achieves higher reconstruction accuracy and generates valid, novel, and diverse MOFs, including blocks unseen during training. The approach advances automated MOF design with practical improvements in both structure prediction and generative design, while acknowledging limitations related to initialization tools and energy evaluation. Overall, MOFFlow-2 represents a significant step toward flexible, scalable MOF discovery and design.

Abstract

Designing metal-organic frameworks (MOFs) with novel chemistries is a longstanding challenge due to their large combinatorial space and the complex 3D arrangements of their building blocks. While recent deep generative models have enabled scalable MOF generation, they assume (1) a fixed set of building blocks and (2) known local 3D coordinates of building blocks. These assumptions limit their ability to (1) design novel MOFs and (2) generate structures from novel building blocks. We propose a two-stage MOF generation framework that overcomes these limitations by modeling both chemical and geometric degrees of freedom. First, we train a SMILES-based autoregressive model to generate metal and organic building blocks, paired with a cheminformatics toolkit for 3D structure initialization. Second, we introduce a flow matching model that predicts translations, rotations, and torsional angles to assemble the blocks into valid 3D frameworks. Our experiments demonstrate improved reconstruction accuracy, the generation of valid, novel, and unique MOFs, and the ability to create novel building blocks. Our code is available at https://github.com/nayoung10/MOFFlow-2.
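
To make the torsional degree of freedom concrete, the following is a minimal sketch of how a torsion (dihedral) angle around a rotatable bond can be measured and changed. It is not taken from the MOFFlow-2 code base: the function names `dihedral` and `rotate_about_bond`, the toy coordinates, and the sign convention are all assumptions chosen for illustration.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    n = math.sqrt(_dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) of atoms p0-p1-p2-p3 about the p1-p2 bond."""
    axis = _unit(_sub(p2, p1))
    b0 = _sub(p0, p1)
    b2 = _sub(p3, p2)
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    return math.atan2(_dot(_cross(axis, v), w), _dot(v, w))

def rotate_about_bond(point, p1, p2, delta):
    """Rotate `point` by `delta` radians about the p1->p2 axis (Rodrigues' formula)."""
    k = _unit(_sub(p2, p1))
    r = _sub(point, p2)
    kxr = _cross(k, r)
    kr = _dot(k, r)
    c, s = math.cos(delta), math.sin(delta)
    return tuple(p2[i] + r[i] * c + kxr[i] * s + k[i] * kr * (1 - c) for i in range(3))

# Toy four-atom chain (hypothetical coordinates, arbitrary units).
p0, p1, p2, p3 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
phi_before = dihedral(p0, p1, p2, p3)
p3_new = rotate_about_bond(p3, p1, p2, 0.3)   # twist the far side of the bond
phi_after = dihedral(p0, p1, p2, p3_new)
```

In a real building block, every atom on one side of the rotatable bond would be rotated together, so a single torsion update moves a whole fragment while leaving bond lengths and bond angles unchanged.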

Paper Structure

This paper contains 36 sections, 15 equations, 10 figures, 11 tables, 10 algorithms.

Figures (10)

  • Figure 1: Overview of MOFFlow-2. MOFFlow-2 is a two-stage generative framework for MOF generation and structure prediction. The first stage uses a building block generator to generate an MOF sequence in SMILES representation, which is initialized to 3D coordinates with the metal library and RDKit. In the second stage, our structure prediction model assembles these building blocks by modeling translation $\bm{\tau}$, rotation $\bm{q}$, torsion $\bm{\phi}$, and lattice $\bm{\ell}$.
  • Figure 2: Structure prediction model architecture. The model consists of three main modules: an initialization module that encodes atom features; an interaction module based on a Transformer encoder; and an output module with four prediction heads for each structural component -- i.e., rotation (dimension $M$, the number of building blocks), translation ($M$), lattice parameters ($1$), and torsion angles (dimension $P$, the number of rotatable bonds). Notably, torsion angles are predicted by first constructing rotatable bond features with four corresponding dihedral atoms and updating the feature with attention to nearby atoms.
  • Figure 3: Canonicalization. MOF building blocks often exhibit high symmetry point groups and $\pi$-symmetry bonds, resulting in multiple valid rotation and torsion targets. This ambiguity can lead to unstable training. To resolve it, for rotations we assign a unique target by finding the closest rotation in terms of RMSD; for torsions, we uniquely define the neighboring atoms of a rotatable bond using RDKit's canonical atom rankings.
  • Figure 4: Property distributions. We compare MOF property distributions of the ground-truth, MOFDiff, and MOFFlow-2. The distribution has been smoothed with kernel density estimation. Compared to MOFDiff, MOFFlow-2 closely aligns with the ground-truth distribution and covers a broader range of values, demonstrating that MOFFlow-2 can generate MOFs with diverse properties.
  • Figure 5: Samples from MOFFlow-2. Visualizations of samples generated by MOFFlow-2 that are valid, novel, and unique.
  • ...and 5 more figures
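
The rotation canonicalization in Figure 3 is described only as picking the closest rotation in terms of RMSD. One common way to obtain an RMSD-minimizing rotation between two matched point sets is the Kabsch algorithm, sketched below with NumPy. Whether MOFFlow-2 uses exactly this procedure is an assumption; the function names `kabsch_rotation` and `rmsd` and the synthetic test data are hypothetical.

```python
import numpy as np

def kabsch_rotation(P, Q):
    """Rotation R (3x3) minimizing the RMSD between R @ p_i and q_i.

    P and Q are (N, 3) arrays of matched points; both are centered first.
    """
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                      # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

def rmsd(A, B):
    """Root-mean-square deviation between two (N, 3) arrays of matched points."""
    return float(np.sqrt(((A - B) ** 2).sum(axis=1).mean()))

# Synthetic check: rotate a random point cloud by a known rotation, then recover it.
rng = np.random.default_rng(0)
P = rng.normal(size=(6, 3))
a, b = 0.7, 0.4
Rz = np.array([[np.cos(a), -np.sin(a), 0.0], [np.sin(a), np.cos(a), 0.0], [0.0, 0.0, 1.0]])
Rx = np.array([[1.0, 0.0, 0.0], [0.0, np.cos(b), -np.sin(b)], [0.0, np.sin(b), np.cos(b)]])
R_true = Rz @ Rx
Q = P @ R_true.T + np.array([1.0, -2.0, 0.5])   # rotated and shifted copy of P

R_est = kabsch_rotation(P, Q)
aligned = (P - P.mean(axis=0)) @ R_est.T
residual = rmsd(aligned, Q - Q.mean(axis=0))
```

For a highly symmetric building block, several rotations map the block onto itself. Always selecting the RMSD-minimizing one gives the model a single, consistent regression target.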