MING: A Functional Approach to Learning Molecular Generative Models
Van Khoa Nguyen, Maciej Falkiewicz, Giangiacomo Mercatali, Alexandros Kalousis
TL;DR
MING tackles molecule generation by learning distributions directly in function space rather than over graphs or sequences. It introduces a diffusion process over molecular function evaluations, coupled with an expectation-maximization (EM) denoising objective and TwinINR denoisers that operate on irregular domains defined by graph spectral coordinates. The result is a lightweight, fast generator that achieves competitive or superior validity, novelty, and distribution-alignment scores (e.g., FCD, NSPDK) on QM9, ZINC250k, and MOSES, with fewer parameters and shorter inference time than data-space baselines. This functional perspective offers a scalable alternative to graph-equivariant architectures and points to broader applications of function-space generative modeling in chemistry.
Abstract
Traditional molecule generation methods often rely on sequence- or graph-based representations, which can limit their expressive power or require complex permutation-equivariant architectures. This paper introduces a novel paradigm for learning molecule generative models based on functional representations. Specifically, we propose Molecular Implicit Neural Generation (MING), a diffusion-based model that learns molecular distributions in the function space. Unlike standard diffusion processes in the data space, MING employs a novel functional denoising probabilistic process, which jointly denoises information in both the function's input and output spaces by leveraging an expectation-maximization procedure for latent implicit neural representations of data. This approach enables a simple yet effective model design that accurately captures underlying function distributions. Experimental results on molecule-related datasets demonstrate MING's superior performance and ability to generate plausible molecular samples, surpassing state-of-the-art data-space methods while offering a more streamlined architecture and significantly faster generation times. The code is available at https://github.com/v18nguye/MING.
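
The graph spectral coordinates mentioned in the TL;DR can be illustrated with a minimal sketch. It assumes the coordinates are the low-frequency eigenvectors of the symmetric normalized graph Laplacian; the paper's exact construction may differ. The function name `spectral_coordinates`, the parameter `k`, and the three-atom chain are hypothetical choices made for illustration only and are not taken from the MING code base.

```python
import numpy as np


def spectral_coordinates(adjacency: np.ndarray, k: int) -> np.ndarray:
    """Return k Laplacian eigenvectors as per-node input coordinates.

    Assumption (not confirmed by the source): the symmetric normalized
    Laplacian L = I - D^{-1/2} A D^{-1/2} is used, and the graph is
    connected, so only the first eigenvector (eigenvalue 0) is trivial.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degree = adjacency.sum(axis=1)

    # D^{-1/2}, leaving isolated nodes (degree 0) at zero.
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt[nonzero] = degree[nonzero] ** -0.5

    laplacian = (
        np.eye(len(degree))
        - inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]
    )

    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
    _, eigvecs = np.linalg.eigh(laplacian)

    # Skip the trivial eigenvector and keep the next k as coordinates.
    return eigvecs[:, 1 : k + 1]


# Hypothetical example: a three-atom chain (e.g. the heavy atoms C-C-C).
adjacency = np.array(
    [
        [0, 1, 0],
        [1, 0, 1],
        [0, 1, 0],
    ]
)
coords = spectral_coordinates(adjacency, k=2)  # one row of coordinates per atom
```

Each row of `coords` gives one atom a position on an irregular domain, and a function-space model would take such rows as inputs. Eigenvectors are defined only up to sign, so a practical pipeline would need a convention or augmentation to handle that ambiguity.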
