MolMiner: Towards Controllable, 3D-Aware, Fragment-Based Molecular Design
Raul Ortega-Ochoa, Tejs Vegge, Jes Frellsen
TL;DR
MolMiner tackles inverse molecular design with a controllable generator that combines fragment-based construction with dynamically updated 3D geometry. It introduces an order-agnostic rollout and symmetry-aware fragment attachment, and supports generation conditioned on up to twelve properties; a Gaussian Mixture Model prior samples any conditioning values the user leaves unspecified. The paper reports calibrated conditional generation across most properties and competitive unconditional performance, and proposes new benchmarking protocols based on Wasserstein distances and calibration plots. The work targets practical, interpretable, multi-property molecular design, with potential applications in materials discovery, drug design, and green chemistry.
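The idea of filling in unspecified conditioning values from a Gaussian Mixture Model can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal covariances (so that, within a component, properties are independent), and the function names, parameter layout, and toy numbers are all hypothetical. Given the specified properties, each component is reweighted by how well it explains them, one component is drawn, and the missing properties are sampled from it.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a 1D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def sample_missing(weights, means, stds, observed, rng):
    """Complete a partially specified property vector (hypothetical helper).

    weights:  mixture weights, one per component
    means:    per-component list of per-property means
    stds:     per-component list of per-property standard deviations
              (diagonal covariance is an assumption of this sketch)
    observed: one entry per property; a float if specified, None otherwise
    """
    # Reweight each component by the likelihood of the specified entries.
    posterior = []
    for w, mu, sd in zip(weights, means, stds):
        like = w
        for x, m, s in zip(observed, mu, sd):
            if x is not None:
                like *= normal_pdf(x, m, s)
        posterior.append(like)
    total = sum(posterior)  # may underflow to 0 for extreme inputs
    posterior = [p / total for p in posterior]

    # Draw one component, then sample the unspecified entries from it.
    k = rng.choices(range(len(weights)), weights=posterior)[0]
    completed = [
        x if x is not None else rng.gauss(means[k][i], stds[k][i])
        for i, x in enumerate(observed)
    ]
    return completed, posterior


# Toy example: two components over two properties; only the first is specified.
rng = random.Random(0)
completed, posterior = sample_missing(
    weights=[0.5, 0.5],
    means=[[0.0, 0.0], [10.0, 10.0]],
    stds=[[1.0, 1.0], [1.0, 1.0]],
    observed=[10.0, None],
    rng=rng,
)
```

With full covariance matrices the missing entries would instead be drawn from each component's conditional Gaussian, which requires a linear solve per component; the diagonal case is used here only to keep the sketch short.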
Abstract
We introduce MolMiner, a fragment-based, geometry-aware, and order-agnostic autoregressive model for molecular design. MolMiner supports conditional generation of molecules over twelve properties, enabling flexible control across physicochemical and structural targets. Molecules are built via symmetry-aware fragment attachments, with 3D geometry dynamically updated during generation using forcefields. A probabilistic conditioning mechanism allows users to specify any subset of target properties while sampling the rest. MolMiner achieves calibrated conditional generation across most properties and offers competitive unconditional performance. We also propose improved benchmarking methods for both unconditional and conditional generation, including distributional comparisons via Wasserstein distance and calibration plots for property control. To our knowledge, this is the first model to unify dynamic geometry, symmetry handling, order-agnostic fragment-based generation, and high-dimensional multi-property conditioning.
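As a rough illustration of the distributional comparison mentioned above, the sketch below computes the 1D Wasserstein-1 distance between two equal-size samples of a scalar property, for example values from a reference set and from generated molecules. For equal-size samples with uniform weights this distance is the mean absolute difference between the sorted samples. The function name and the toy values are hypothetical, and the abstract does not state the paper's exact protocol (which properties, sample sizes, or normalization), so this shows the general technique only.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1D empirical samples."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # Optimal transport in 1D matches the i-th smallest value of one sample
    # to the i-th smallest value of the other.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Toy property values (made up for illustration).
reference = [0.0, 1.0, 2.0]
generated = [3.0, 1.0, 2.0]
print(wasserstein_1d(reference, generated))  # 1.0
```

For samples of unequal size or with weights, `scipy.stats.wasserstein_distance` covers the general 1D case.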
