Balancing property optimization and constraint satisfaction for constrained multi-property molecular optimization

Xin Xia, Yajie Zhang, Xiangxiang Zeng, Xingyi Zhang, Chunhou Zheng, Yansen Su

TL;DR

Experimental results show the superior performance of the proposed CMOMO over five state-of-the-art molecular optimization methods on two benchmark tasks of simultaneously optimizing multiple non-biological activity properties while satisfying two structural constraints.

Abstract

Molecular optimization, which aims to discover improved molecules in a vast chemical search space, is a critical step in chemical development. Various artificial intelligence technologies have demonstrated high effectiveness and efficiency on molecular optimization tasks. However, few of them focus on balancing property optimization with constraint satisfaction, which makes it difficult to obtain high-quality molecules that both possess desirable properties and meet various constraints. To address this issue, we propose a constrained multi-property molecular optimization framework (CMOMO), a flexible and efficient method that simultaneously optimizes multiple molecular properties while satisfying several drug-like constraints. CMOMO improves multiple properties of molecules under constraints through dynamic cooperative optimization, which dynamically handles the constraints across various scenarios. In addition, CMOMO evaluates multiple properties within discrete chemical spaces cooperatively with the evolution of molecules within an implicit molecular space to guide the evolutionary search. Experimental results show the superior performance of the proposed CMOMO over five state-of-the-art molecular optimization methods on two benchmark tasks of simultaneously optimizing multiple non-biological activity properties while satisfying two structural constraints. Furthermore, the practical applicability of CMOMO is verified on two practical tasks, where it identified a collection of candidate ligands of the $\beta_2$-adrenoceptor GPCR and candidate inhibitors of glycogen synthase kinase-3$\beta$ (GSK3$\beta$) that have favorable property values and satisfy drug-like constraints.
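
For orientation, the kind of problem the abstract describes is commonly written as a constrained multi-objective program. The formulation below is the generic textbook form, not necessarily the paper's own notation; the symbols $x$, $\mathcal{X}$, $f_i$, $g_j$, $m$, and $k$ are assumptions introduced here for illustration.

$$
\max_{x \in \mathcal{X}} \; F(x) = \bigl(f_1(x), \dots, f_m(x)\bigr) \quad \text{subject to} \quad g_j(x) \le 0, \quad j = 1, \dots, k
$$

Here $x$ is a candidate molecule in the search space $\mathcal{X}$, each $f_i$ scores one property to be improved, and each $g_j$ measures the violation of one drug-like constraint. Under this convention a molecule is feasible when every $g_j(x) \le 0$. Because the properties generally conflict, the answer is a set of feasible trade-off molecules rather than a single optimum.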

Paper Structure

This paper contains 20 sections, 7 equations, 9 figures.

Figures (9)

  • Figure 1: Three problem models for molecular optimization and the schematic diagram of CMOMO. (A) Single-objective optimization aims to find a molecule with the best objective value. Multi-objective optimization aims to search for a set of molecules on the Pareto front (PF) that trade off multiple properties. Constrained multi-objective optimization aims to identify a set of molecules on the constrained Pareto front (CPF) that trade off multiple properties and meet drug-like constraints. (B) The optimization process of the CMOMO framework. CMOMO first focuses on property optimization to find molecules positioned on the PF, and then discovers the molecules on the CPF to solve the constrained multi-objective molecular optimization problem.
  • Figure 2: The illustrative diagram of CMOMO. (A) To begin with, CMOMO generates an initial population of $P$ molecules for a lead molecule. Then, CMOMO performs the dynamic cooperative optimization. Finally, CMOMO obtains a set of feasible molecules (with desired molecular properties and satisfying the drug-like constraints). (B) The vector-fragmentation-based evolutionary reproduction (VFER) strategy. The VFER strategy is employed to generate promising offspring molecules through linear crossover and fragmentation-based mutation operations. (C) The ranking aggregation strategy. This strategy dynamically aggregates the rankings of molecules that are ordered by properties and by constraints, respectively.
  • Figure 3: Performance of CMOMO and comparison methods on two benchmark constrained multi-objective optimization tasks. (A) Success rate (SR) of CMOMO and comparison methods on Task 1. (B) SR of CMOMO and comparison methods on Task 2. (C) Hypervolume (HV) of CMOMO and three Pareto optimization methods on Task 1. (D) HV of CMOMO and three Pareto optimization methods on Task 2. (E) The number of successfully optimized molecules obtained by four multi-objective optimization methods on Task 1. (F) The number of successfully optimized molecules obtained by four multi-objective optimization methods on Task 2.
  • Figure 4: Performance of CMOMO and comparison methods on Task 3. (A) Success rate (SR) of CMOMO and comparison methods on Task 3. (B) Hypervolume (HV) of CMOMO and three Pareto optimization methods on Task 3. (C) The optimization results obtained by CMOMO on five test instances, where the molecules at the top and bottom rows denote the lead and optimized molecules, respectively. (D) The docking pose, QED, binding energy, and Similarity of two optimized molecules for the 4LDE protein are shown. Both molecules exhibit the desired QED, binding energy, and Similarity. The right two sub-figures show the interactions between the optimized molecules and the amino acid residues in protein binding pockets.
  • Figure 5: Performance of CMOMO and comparison methods on Task 4. (A) Success rate (SR) of CMOMO and comparison methods on Task 4. (B) Hypervolume (HV) of CMOMO and three Pareto optimization methods on Task 4. (C) The optimization results obtained by CMOMO on five test instances, where the molecules at the top and bottom rows denote the lead and optimized molecules, respectively. (D) The docking poses, QED, binding energy, SA, and Similarity of two optimized molecules for the GSK3$\beta$ target are shown. Both molecules exhibit the desired QED, binding energy, SA, and Similarity. The right two sub-figures show the interactions between the optimized molecules and the amino acid residues in protein binding pockets.
  • ...and 4 more figures
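
The distinction between the Pareto front (PF) and the constrained Pareto front (CPF) drawn in the Figure 1 caption can be illustrated with a small sketch. It is not the CMOMO algorithm: it only filters a fixed list of candidates, it assumes every property is to be maximized, and it assumes a molecule is feasible when its total constraint violation is at most zero. The function names, the score tuples, and the violation values are all hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every property and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of candidates that no other candidate dominates (the PF)."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def constrained_pareto_front(
    scores: List[Sequence[float]], violations: Sequence[float]
) -> List[int]:
    """Indices of feasible candidates that no other feasible candidate dominates (the CPF)."""
    # Assumption: violation <= 0 means all constraints are satisfied.
    feasible = [i for i, v in enumerate(violations) if v <= 0.0]
    front = pareto_front([scores[i] for i in feasible])
    return [feasible[k] for k in front]


# Hypothetical two-property scores for five candidate molecules.
scores = [(0.9, 0.2), (0.7, 0.6), (0.4, 0.8), (0.5, 0.5), (0.8, 0.7)]
# Hypothetical total constraint violation; only the last candidate is infeasible.
violations = [0.0, 0.0, 0.0, 0.0, 1.5]

pf = pareto_front(scores)
cpf = constrained_pareto_front(scores, violations)
print("PF indices: ", pf)
print("CPF indices:", cpf)
```

In this toy data the infeasible candidate at index 4 sits on the PF and dominates the candidate at index 1. Once it is excluded, index 1 becomes non-dominated among the feasible candidates, so the CPF is not simply the feasible part of the PF.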