Leveraging Latent Evolutionary Optimization for Targeted Molecule Generation
Siddartha Reddy N, Sai Prakash MV, Varun V, Vishal Vaddina, Saisubramaniam Gopalakrishnan
TL;DR
LEOMol performs evolutionary search in the latent space of a VAE to enable targeted molecule generation with non-differentiable property predictors. By integrating Genetic Algorithm and Differential Evolution search with a SELFIES-based VAE trained on ZINC250k, LEOMol outperforms several state-of-the-art baselines on property optimization, property targeting, and constrained optimization tasks. The approach emphasizes toxicity-aware optimization, demonstrating that enforcing a toxicity constraint yields fully non-toxic, diverse molecules without sacrificing key drug-likeness metrics. The framework offers a fast, flexible alternative for hit generation and lead optimization, and could accelerate early-stage drug discovery by using non-differentiable oracles during latent-space exploration.
Abstract
Lead optimization is a pivotal task in the drug design phase of the drug discovery lifecycle. Its primary objective is to refine a lead compound so that it meets the specific molecular properties required to progress to the next phase of development. In this work, we present Latent Evolutionary Optimization for Molecule Generation (LEOMol), a generative modeling framework for the efficient generation of optimized molecules. LEOMol uses Evolutionary Algorithms, such as Genetic Algorithm and Differential Evolution, to search the latent space of a Variational AutoEncoder (VAE). This search identifies the region of the latent space that corresponds to the target molecule distribution. Our approach consistently outperforms previous state-of-the-art models across a range of constrained molecule generation tasks, including all four sub-tasks related to property targeting. We also argue for including toxicity in the evaluation of generative models. Furthermore, an ablation study shows the improvements that our approach provides over gradient-based latent space optimization methods. Together, these results demonstrate the effectiveness of LEOMol in addressing the inherent challenges of constrained molecule generation and its potential to advance drug discovery.
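To make the idea of evolutionary search in a latent space concrete, the following is a minimal sketch of Differential Evolution (the common rand/1/bin variant) maximizing a non-differentiable score over latent vectors. It is an illustration under stated assumptions, not the LEOMol implementation: `toy_oracle`, `differential_evolution`, the 4-dimensional latent space, and all hyperparameter values are hypothetical. In a real pipeline the oracle would presumably decode the latent vector with the VAE decoder and score the resulting molecule with a property predictor; here a piecewise-constant toy function stands in for that step.

```python
import numpy as np


def toy_oracle(z):
    """Hypothetical stand-in for: decode z -> molecule -> property score.

    Rounding makes the score piecewise constant, so its gradient is zero
    almost everywhere and gradient-based optimization gets no signal.
    """
    target = np.array([1.0, -2.0, 0.5, 3.0])
    snapped = np.round(z * 2.0) / 2.0  # snap each coordinate to a 0.5 grid
    return -float(np.sum(np.abs(snapped - target)))


def differential_evolution(oracle, dim, pop_size=20, generations=100,
                           f=0.5, cr=0.9, seed=0):
    """Maximize `oracle` over R^dim with DE/rand/1/bin (illustrative values)."""
    rng = np.random.default_rng(seed)
    # Initial population sampled from a standard normal, mimicking a VAE prior.
    pop = rng.normal(size=(pop_size, dim))
    scores = np.array([oracle(z) for z in pop])

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = a + f * (b - c)

            # Binomial crossover; force at least one coordinate from the mutant.
            cross = rng.random(dim) < cr
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])

            # Greedy selection: only oracle evaluations are needed, no gradients.
            trial_score = oracle(trial)
            if trial_score >= scores[i]:
                pop[i] = trial
                scores[i] = trial_score

    best = int(np.argmax(scores))
    return pop[best], float(scores[best])


best_z, best_score = differential_evolution(toy_oracle, dim=4)
print("best latent vector:", best_z)
print("best score:", best_score)
```

Because selection is greedy, the best score in the population never decreases from one generation to the next. The search needs only oracle evaluations, which is why this family of methods can work with property predictors that are not differentiable.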
