Efficient Generation of Molecular Clusters with Dual-Scale Equivariant Flow Matching

Akshay Subramanian, Shuhui Qu, Cheol Woo Park, Sulin Liu, Janghwan Lee, Rafael Gómez-Bombarelli

TL;DR

The paper tackles the high computational cost of generating conformational ensembles for amorphous molecular solids by introducing a dual-scale flow matching framework. It employs two vector-field networks, $v_{\theta}$ for coarse-grained beads and $v_{\phi}$ for all-atom coordinates, to perform coarse-to-fine generation via conditional flow matching with separate training objectives and priors. Compared to single-scale flow matching, the dual-scale approach yields 15–25% improvements in bond-length and bond-angle distribution accuracy and achieves up to ~85% faster per-step inference on an A100 GPU, demonstrated on MD-derived Y6 clusters. This method enables scalable sampling for larger systems relevant to organic electronics, with future work aimed at exploring alternative coarse-graining mappings and broader performance metrics.
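
The coarse-grained representation mentioned above can be illustrated with a small sketch. It assumes each bead sits at the mass-weighted center of the atoms assigned to it, which is a common coarse-graining convention but is not confirmed here as the authors' choice. The function name `coarse_grain`, the `bead_of_atom` assignment, and the toy coordinates are all hypothetical; the paper's actual 13-bead Y6 mapping is not reproduced.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def coarse_grain(
    coords: Sequence[Vec3],
    masses: Sequence[float],
    bead_of_atom: Sequence[int],
) -> List[Vec3]:
    """Map all-atom coordinates to bead positions (mass-weighted centers).

    bead_of_atom[i] is the index of the bead that atom i belongs to.
    The assignment is a hypothetical input, not the paper's Y6 scheme.
    """
    if not (len(coords) == len(masses) == len(bead_of_atom)):
        raise ValueError("coords, masses and bead_of_atom must have equal length")
    n_beads = max(bead_of_atom) + 1
    weighted = [[0.0, 0.0, 0.0] for _ in range(n_beads)]
    total_mass = [0.0] * n_beads
    # Accumulate mass-weighted coordinates and total mass per bead.
    for xyz, mass, bead in zip(coords, masses, bead_of_atom):
        for axis in range(3):
            weighted[bead][axis] += mass * xyz[axis]
        total_mass[bead] += mass
    if any(m <= 0.0 for m in total_mass):
        raise ValueError("every bead needs at least one atom with positive mass")
    return [(w[0] / m, w[1] / m, w[2] / m) for w, m in zip(weighted, total_mass)]


# Toy example (made-up numbers): three atoms grouped into two beads.
atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 0.0, 4.0)]
beads = coarse_grain(atoms, masses=[1.0, 1.0, 2.0], bead_of_atom=[0, 0, 1])
```

Fewer beads than atoms means the coarse stage works on a much smaller coordinate set. That is one plausible reason a coarse-to-fine scheme can reduce inference cost.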

Abstract

Amorphous molecular solids offer a promising alternative to inorganic semiconductors, owing to their mechanical flexibility and solution processability. The packing structure of these materials plays a crucial role in determining their electronic and transport properties, which are key to enhancing the efficiency of devices like organic solar cells (OSCs). However, obtaining these optoelectronic properties computationally requires molecular dynamics (MD) simulations to generate a conformational ensemble, a process that can be computationally expensive due to the large system sizes involved. Recent advances have focused on using generative models, particularly flow-based models as Boltzmann generators, to improve the efficiency of MD sampling. In this work, we developed a dual-scale flow matching method that separates training and inference into coarse-grained and all-atom stages and enhances both the accuracy and efficiency of standard flow matching samplers. We demonstrate the effectiveness of this method on a dataset of Y6 molecular clusters obtained through MD simulations, and we benchmark its efficiency and accuracy against single-scale flow matching methods.
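
To make the flow matching idea concrete, the following is a minimal sketch of one common conditional flow matching setup: a straight-line path between a prior sample and a data sample, a mean-squared-error regression target, and Euler integration at inference. The linear path, the helper names (`cfm_training_pair`, `cfm_loss`, `euler_sample`), and the constant toy vector field are assumptions made for illustration. The abstract does not specify the authors' path, priors, or network architecture. A dual-scale method would presumably train two such fields, one on bead coordinates and one on atom coordinates, but how the all-atom stage is conditioned on the beads is not described in this excerpt.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def cfm_training_pair(
    x0: Sequence[float], x1: Sequence[float], t: float
) -> Tuple[Vector, Vector]:
    """Return (x_t, u_t) for a straight-line path from prior x0 to data x1.

    x_t = (1 - t) * x0 + t * x1 is the interpolated point at time t in [0, 1];
    u_t = x1 - x0 is the velocity the network is regressed onto.
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - a for a, b in zip(x0, x1)]
    return x_t, u_t


def cfm_loss(v_pred: Sequence[float], u_t: Sequence[float]) -> float:
    """Mean squared error between predicted and target velocities."""
    return sum((p - u) ** 2 for p, u in zip(v_pred, u_t)) / len(u_t)


def euler_sample(
    v: Callable[[Vector, float], Vector], x0: Sequence[float], n_steps: int
) -> Vector:
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        velocity = v(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, velocity)]
    return x


# Toy check: a field that already equals the target velocity transports
# the prior sample exactly onto the data sample.
x_t, u_t = cfm_training_pair([0.0, 0.0], [2.0, 4.0], t=0.25)
sample = euler_sample(lambda x, t: [2.0, 4.0], [0.0, 0.0], n_steps=4)
```

In practice the lambda would be replaced by a trained, equivariant neural network, and each of the `n_steps` integration steps would cost one network evaluation.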

Paper Structure

This paper contains 11 sections, 3 equations, 1 figure, 3 tables.

Figures (1)

  • Figure 1: Coarse-graining mapping scheme and influence of the CG:AA ratio on metrics. (a) Our coarse-graining mapping scheme for an individual Y6 molecule, consisting of 13 beads shown in different colors. (b) Jensen-Shannon divergence (JSD) of bond lengths and bond angles, and inference time for test-data generation, as a function of the CG:AA ratio. As the ratio increased, inference time decreased noticeably while the JSD values changed negligibly.
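
The Jensen-Shannon divergence (JSD) used as a metric in Figure 1 can be computed between two histograms, for instance of generated and reference bond lengths. The sketch below assumes equal-width bins and base-2 logarithms, so the result lies in [0, 1]. The bin range, bin count, and sample values are made up for illustration, and the authors' binning and log base are not stated in this excerpt.

```python
import math
from typing import List, Sequence


def histogram(values: Sequence[float], lo: float, hi: float, n_bins: int) -> List[float]:
    """Normalized equal-width histogram over [lo, hi]; out-of-range values are skipped."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi falls into the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside [lo, hi]")
    return [c / total for c in counts]


def jensen_shannon_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    mix = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Bins with zero probability in `a` contribute nothing to the sum.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0.0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


# Made-up bond lengths in angstroms, for illustration only.
reference = histogram([1.38, 1.40, 1.41, 1.43], lo=1.3, hi=1.5, n_bins=4)
generated = histogram([1.37, 1.40, 1.42, 1.46], lo=1.3, hi=1.5, n_bins=4)
score = jensen_shannon_divergence(reference, generated)
```

A score of 0 means the two histograms are identical, and larger values indicate that the generated distribution deviates more from the reference.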