MolEvolve: LLM-Guided Evolutionary Search for Interpretable Molecular Optimization

Xiangsen Chen, Ruilong Wu, Yanyan Lan, Ting Ma, Yang Liu

Abstract

Despite deep learning's success in chemistry, its impact is hindered by a lack of interpretability and an inability to resolve activity cliffs, where minor structural changes trigger drastic property shifts. Current representation learning, bound by the similarity principle, often fails to capture these structure-activity discontinuities. To address this, we introduce MolEvolve, an evolutionary framework that reformulates molecular discovery as an autonomous, look-ahead planning problem. Unlike traditional methods that depend on human-engineered features or rigid prior knowledge, MolEvolve leverages a Large Language Model (LLM) to actively explore and evolve a library of executable chemical symbolic operations. Using the LLM for a cold start and a Monte Carlo Tree Search (MCTS) engine for test-time planning with external tools (e.g., RDKit), the system autonomously discovers optimal trajectories. This process evolves transparent reasoning chains that translate complex structural transformations into actionable, human-readable chemical insights. Experimental results demonstrate that MolEvolve not only yields such interpretable insights but also outperforms baselines in both property prediction and molecule optimization tasks.
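The paper's LLM-MCTS loop is not detailed in this summary. Purely as an illustration of the test-time planning component, the sketch below shows a minimal, generic MCTS iteration with UCT selection in plain Python. The `Node`, `expand`, and `reward` names are placeholders, not the paper's API; in MolEvolve the expansion would come from the LLM-evolved symbolic operations and the reward from RDKit-based verification, rather than the toy scorer used here.

```python
import math
import random

class Node:
    """A search-tree node holding one candidate state (here, just a string)."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0  # cumulative reward

def uct_score(node, c=1.4):
    """Upper Confidence bound for Trees: exploitation plus an exploration bonus."""
    if node.visits == 0:
        return float("inf")  # always try unvisited children first
    exploit = node.value / node.visits
    explore = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return exploit + explore

def mcts_step(root, expand, reward):
    """One select -> expand -> evaluate -> backpropagate iteration."""
    # Selection: descend by UCT until reaching a leaf.
    node = root
    while node.children:
        node = max(node.children, key=uct_score)
    # Expansion: generate child states (placeholder `expand` callable).
    for child_state in expand(node.state):
        node.children.append(Node(child_state, parent=node))
    leaf = random.choice(node.children) if node.children else node
    # Evaluation: score the leaf with an external oracle (placeholder `reward`).
    r = reward(leaf.state)
    # Backpropagation: update visit counts and values along the path to the root.
    while leaf is not None:
        leaf.visits += 1
        leaf.value += r
        leaf = leaf.parent
    return r
```

A toy run, e.g. `root = Node("CCO")` with `expand=lambda s: [s + "C", s + "O"]` and a length-based reward, accumulates visit statistics that steer later iterations toward higher-scoring branches.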


Paper Structure

This paper contains 20 sections, 1 theorem, 10 equations, 9 figures, 4 tables, 1 algorithm.

Key Result

Theorem 2.3

Let $x \in \mathcal{X}_{\text{cliff}}$ be a molecule at an activity cliff. Assume the model perfectly fits the training sample $x$, i.e., $f_{\theta}(x) = f^*(x)$. Then, there exists a neighbor $x' \in \mathcal{N}_1(x)$ such that the prediction error is strictly lower-bounded by the difference in Li…

Figures (9)

  • Figure 1: (Up) Existing GNNs act as black boxes with post-hoc interpretations, while LLMs suffer from numerical hallucinations on semantic manifolds. (Down) The representation-precision paradox: minor structural distances lead to "activity cliffs" in molecular manifolds, which LLMs fail to capture precisely due to semantic smoothing.
  • Figure 2: The overall architecture of MolEvolve. Phase 1 (Cold Start) distills domain knowledge into executable heuristic rules via symbolic grounding and self-correction. Phase 2 (LLM-MCTS) utilizes these rules to initialize an evolutionary search tree, where the LLM acts as molecule operator to guide selection and expansion within a rigorous verification loop.
  • Figure 3: Search Efficiency and Model Adaptability Analysis: Test RMSE trajectories across 100 iterations. To verify that the evolved symbolic features are model-agnostic, we employ three distinct primary models as downstream evaluators. In each iteration, the MCTS engine generates a candidate feature set, which is then frozen to train these evaluators for scoring.
  • Figure 4: An example of an evolutionary trajectory of molecular optimization (logP).
  • Figure 5: Hyperparameter Sensitivity Analysis Heatmaps. We visualize the Average Improvement (Left) and Success Rate (Right) for LogP (Top) and QED (Bottom) tasks. Darker colors indicate superior performance.
  • ...and 4 more figures

Theorems & Definitions (3)

  • Definition 2.1: Discrete Local Lipschitz Constant
  • Theorem 2.3: Lower Bound of Worst-Case Error
  • Proof
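Definition 2.1 itself is not reproduced in this summary. For orientation only, a conventional formulation of a discrete local Lipschitz constant over single-edit neighborhoods, which the theorem's lower bound presumably builds on, reads:

```latex
% Hedged sketch, not the paper's exact Definition 2.1:
% \mathcal{N}_1(x) denotes the molecules within one structural edit of x,
% and f^* is the ground-truth property function, as in Theorem 2.3.
L_{\mathrm{loc}}(x) \;=\; \max_{x' \in \mathcal{N}_1(x)} \bigl| f^*(x) - f^*(x') \bigr|
```

Under such a definition, an activity cliff is precisely a molecule $x$ for which $L_{\mathrm{loc}}(x)$ is large despite its neighbors being structurally near-identical.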