Leveraging Latent Evolutionary Optimization for Targeted Molecule Generation

Siddartha Reddy N, Sai Prakash MV, Varun V, Vishal Vaddina, Saisubramaniam Gopalakrishnan

TL;DR

LEOMol introduces latent evolution in a VAE latent space to enable targeted molecule generation with non-differentiable property predictors. By integrating Genetic Algorithm and Differential Evolution search within a SELFIES-based VAE trained on ZINC250k, LEOMol achieves superior performance on property optimization, targeting, and constrained optimization tasks compared with several state-of-the-art baselines. The approach emphasizes toxicity-aware optimization, demonstrating that enforcing a toxicity constraint yields fully non-toxic, diverse molecules without sacrificing key drug-likeness metrics. This framework offers a fast, flexible alternative for hit generation and lead optimization, with potential to accelerate early-stage drug discovery by leveraging non-differentiable oracles in latent-space exploration.

Abstract

Lead optimization is a pivotal task in the drug design phase of the drug discovery lifecycle. Its primary objective is to refine a lead compound so that it meets specific molecular property requirements before progressing to the next phase of development. In this work, we present Latent Evolutionary Optimization for Molecule Generation (LEOMol), a generative modeling framework for the efficient generation of optimized molecules. LEOMol uses Evolutionary Algorithms, such as Genetic Algorithm and Differential Evolution, to search the latent space of a Variational AutoEncoder (VAE) and thereby identify the target molecule distribution within it. Our approach consistently outperforms previous state-of-the-art models across a range of constrained molecule generation tasks, including all four sub-tasks related to property targeting. We also argue for including toxicity in the evaluation of generative models. Finally, an ablation study shows the improvements that our approach provides over gradient-based latent space optimization methods. Together, these results demonstrate the effectiveness of LEOMol in addressing the inherent challenges of constrained molecule generation and its potential to advance drug discovery.
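
The core idea in the abstract, searching a VAE latent space with a Genetic Algorithm scored by a property predictor that need not be differentiable, can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_decode`, `toy_oracle`, `latent_ga`, and all hyperparameters below are hypothetical stand-ins chosen so the example runs without a trained model. In a real setting, the decoder would be the pre-trained SELFIES VAE and the oracle a property scorer such as QED.

```python
import numpy as np

def toy_decode(z):
    # Hypothetical stand-in for a VAE decoder. A real decoder would map the
    # latent vector to a SELFIES string; here we just round z so that the
    # score below is piecewise constant (non-differentiable) in z.
    return np.round(z, 1)

def toy_oracle(mol):
    # Hypothetical black-box property score (higher is better).
    return -float(np.sum(np.abs(mol - 1.0)))

def latent_ga(decode, oracle, dim=8, pop_size=32, generations=40,
              mutation_scale=0.3, elite_frac=0.25, seed=0):
    rng = np.random.default_rng(seed)
    # Initial population sampled from the standard normal prior of the VAE.
    pop = rng.standard_normal((pop_size, dim))
    n_elite = max(2, int(elite_frac * pop_size))
    history = []  # best score seen in each generation
    for _ in range(generations):
        scores = np.array([oracle(decode(z)) for z in pop])
        order = np.argsort(scores)[::-1]          # best first
        elites = pop[order[:n_elite]]
        history.append(float(scores[order[0]]))
        children = []
        while len(children) < pop_size - n_elite:
            # Pick two distinct elite parents and mix them coordinate-wise.
            i, j = rng.choice(n_elite, size=2, replace=False)
            mask = rng.random(dim) < 0.5          # uniform crossover
            child = np.where(mask, elites[i], elites[j])
            # Gaussian mutation keeps the search exploring around the parents.
            child = child + mutation_scale * rng.standard_normal(dim)
            children.append(child)
        # Elites survive unchanged, so the best score never decreases.
        pop = np.vstack([elites, np.array(children)])
    scores = np.array([oracle(decode(z)) for z in pop])
    best = pop[int(np.argmax(scores))]
    return best, float(scores.max()), history

best_z, best_score, history = latent_ga(toy_decode, toy_oracle)
print("best score:", best_score)
```

Because the search only ever calls `oracle(decode(z))`, no gradient of the property score with respect to `z` is needed, which is the property that lets this style of search work with non-differentiable predictors.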

Paper Structure

This paper contains 15 sections, 9 equations, 2 figures, 4 tables, 2 algorithms.

Figures (2)

  • Figure 1: Overview of the proposed LEOMol method: (a) A Variational AutoEncoder (VAE) is pre-trained to acquire the ability to reconstruct and generate drug-like molecules using the SELFIES molecule representation. The pre-trained VAE is subsequently employed in conjunction with (b) Genetic Algorithm and (c) Differential Evolution search strategies to explore the latent space, aiming to optimize the desired molecule.
  • Figure 2: Density plots of property scores for molecules produced by Genetic Algorithm search, Gradient Descent search, and random sampling within the VAE latent space, for the QED score maximization and SA score minimization tasks.
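
Figure 1 names Differential Evolution as the second latent-space search strategy. As a rough illustration, the sketch below applies the classic DE/rand/1/bin scheme to latent vectors. It is an assumption-laden toy, not the paper's algorithm: `toy_score` is a hypothetical composition of a decoder and a black-box property predictor, and `latent_de`, `F`, `CR`, and the other settings are illustrative choices.

```python
import numpy as np

def toy_score(z):
    # Hypothetical "decode then score" pipeline. Rounding stands in for the
    # decoder and makes the score non-differentiable in z (higher is better).
    mol = np.round(z, 1)
    return -float(np.sum(np.abs(mol - 1.0)))

def latent_de(score_fn, dim=8, pop_size=24, generations=60,
              F=0.6, CR=0.9, seed=0):
    rng = np.random.default_rng(seed)
    # Initial latent vectors sampled from a standard normal prior.
    pop = rng.standard_normal((pop_size, dim))
    fitness = np.array([score_fn(z) for z in pop])
    initial_best = float(fitness.max())
    for _ in range(generations):
        for i in range(pop_size):
            # Choose three distinct population members other than i.
            others = [k for k in range(pop_size) if k != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            # Mutation: perturb a along the difference vector b - c.
            mutant = a + F * (b - c)
            # Binomial crossover; force at least one coordinate from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # Greedy selection: keep the trial only if it is at least as good.
            trial_fit = score_fn(trial)
            if trial_fit >= fitness[i]:
                pop[i], fitness[i] = trial, trial_fit
    best = int(np.argmax(fitness))
    return pop[best], float(fitness[best]), initial_best

best_z, best_score, initial_best = latent_de(toy_score)
print("initial best:", initial_best, "final best:", best_score)
```

The greedy replacement step means each individual's score can only stay the same or improve, so the final best score is never worse than the best score in the initial random population.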