SynthLens: Visual Analytics for Facilitating Multi-step Synthetic Route Design

Qipeng Wang, Rui Sheng, Shaolun Ruan, Xiaofu Jin, Chuhan Shi, Min Zhu

TL;DR

SynthLens is a visual analytics system for multi-step synthetic route design that integrates literature-derived reactions into a tree-structured workflow. It combines a tree-form route overview with views for paper exploration, molecule similarity, and weighted ranking to support sequential decision-making under both quantitative and qualitative criteria. The approach uses PubMedBERT embeddings for paper clustering and the Eunomia chemist AI agent for extracting reaction details, with a Context Relevancy metric to gauge extraction quality. Case studies and expert interviews indicate improved workflow efficiency and decision quality, suggesting applicability to retrosynthesis and other multi-criteria decision tasks in chemistry.

Abstract

Designing synthetic routes for novel molecules is pivotal in various fields like medicine and chemistry. In this process, researchers need to explore a set of synthetic reactions to transform starting molecules into intermediates step by step until the target novel molecule is obtained. However, designing synthetic routes presents challenges for researchers. First, researchers need to make decisions among numerous possible synthetic reactions at each step, considering various criteria (e.g., yield, experimental duration, and the count of experimental steps) to construct the synthetic route. Second, they must consider the potential impact of one choice at each step on the overall synthetic route. To address these challenges, we proposed SynthLens, a visual analytics system to facilitate the iterative construction of synthetic routes by exploring multiple possibilities for synthetic reactions at each step of construction. Specifically, we have introduced a tree-form visualization in SynthLens to compare and evaluate all the explored routes at various exploration steps, considering both the exploration step and multiple criteria. Our system empowers researchers to consider their construction process comprehensively, guiding them toward promising exploration directions to complete the synthetic route. We validated the usability and effectiveness of SynthLens through a quantitative evaluation and expert interviews, highlighting its role in facilitating the design process of synthetic routes. Finally, we discussed the insights of SynthLens to inspire other multi-criteria decision-making scenarios with visual analytics.

Paper Structure

This paper contains 30 sections, 8 figures, 1 table.

Figures (8)

  • Figure 1: A synthetic route from the starting molecule to the target molecule consists of several synthetic reactions, each involving specific reactants and products. The evaluation of a synthetic reaction contains criteria such as duration and experimental procedure.
  • Figure 2: The analysis workflow of SynthLens. User Input: a user defines a starting molecule and an expected synthetic reaction, then manually selects one of the retrieved papers for exploration. Information Extraction: the synthetic reaction details are extracted automatically. Synthetic Route Construction: the user can integrate the synthetic reaction into the synthetic route under construction to form various decision sequences. Decision-making: finally, the user selects the optimal synthetic route from computed rankings, supplemented by their own preferences. The whole process combines automatic methods with user interaction.
  • Figure 3: The extracted synthetic details are presented in a table format in our system, consisting of several parts: raw materials, experimental operations, duration, yield, etc. Moreover, our system allows users to modify the extracted details within a form.
  • Figure 4: SynthLens: (A) The Control Panel allows users to specify the starting molecule and any expected synthetic reactions before designing the synthetic routes. (B) The Paper Projection View presents the distribution of retrieved papers. (C) The Synthetic Reaction Detail shows the details of extracted synthetic reactions. (D) The Synthetic Route Overview presents a tree-form visualization of the decision sequences of the synthetic routes. (E) The Similarity View shows the structural similarity among specific molecules. (F) The Rank View visualizes the rank of decision sequences considering three factors with flexible weights. (G) The Experimental Procedure Comparison assists users in comparing the experimental procedures of multiple synthetic reactions.
  • Figure 5: The design of the node glyph that represents a synthetic reaction: (A) and (B) respectively represent the duration and total duration accumulated from the root to this node. (C) and (D) respectively represent the yield and total yield. The donut glyph (E) presents the annotated difficulty of this synthetic reaction's experimental procedures. Specifically, (${\rm E_1}$), (${\rm E_2}$), and (${\rm E_3}$) correspond to the difficulty of acquiring reactants, the difficulty of obtaining and using experimental instruments, and the complexity of the experimental operations, respectively. The tooltip (F) displays some details about this synthetic reaction.
  • ...and 3 more figures
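
The captions for Figures 4 and 5 describe per-node totals accumulated from the root and a ranking of decision sequences with flexible weights. The sketch below is one way those quantities could be computed, not the authors' implementation. Every name in it (`ReactionNode`, `rank_routes`, the field names) is hypothetical. It assumes that total yield is the product of the step yields along the path, that the three ranked factors are yield, duration, and difficulty, and that the score is a simple weighted sum. The source specifies none of these.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReactionNode:
    """One synthetic reaction in the route tree (hypothetical structure)."""
    name: str
    duration_h: float   # duration of this step, in hours
    yield_frac: float   # yield of this step, as a fraction in (0, 1]
    difficulty: float   # annotated difficulty, 0 (easy) to 1 (hard)
    # parent is excluded from repr/compare to avoid parent<->child recursion
    parent: Optional["ReactionNode"] = field(default=None, repr=False, compare=False)
    children: List["ReactionNode"] = field(default_factory=list)

    def add_child(self, child: "ReactionNode") -> "ReactionNode":
        child.parent = self
        self.children.append(child)
        return child

    def path(self) -> List["ReactionNode"]:
        """Nodes from the root down to this node (one decision sequence)."""
        node, steps = self, []
        while node is not None:
            steps.append(node)
            node = node.parent
        return steps[::-1]

    def total_duration(self) -> float:
        return sum(n.duration_h for n in self.path())

    def total_yield(self) -> float:
        # Assumption: overall yield is the product of the step yields.
        result = 1.0
        for n in self.path():
            result *= n.yield_frac
        return result


def leaves(root: ReactionNode) -> List[ReactionNode]:
    """All leaf nodes; each leaf identifies one explored route."""
    if not root.children:
        return [root]
    return [leaf for child in root.children for leaf in leaves(child)]


def rank_routes(root, w_yield, w_duration, w_difficulty):
    """Sort routes by an assumed weighted-sum score, best first."""
    routes = leaves(root)
    max_dur = max(r.total_duration() for r in routes) or 1.0  # avoid /0

    def score(route):
        steps = route.path()
        mean_difficulty = sum(n.difficulty for n in steps) / len(steps)
        return (w_yield * route.total_yield()
                + w_duration * (1.0 - route.total_duration() / max_dur)
                + w_difficulty * (1.0 - mean_difficulty))

    return sorted(routes, key=score, reverse=True)


# Illustrative data only; not taken from the paper.
root = ReactionNode("step 1", duration_h=2.0, yield_frac=0.9, difficulty=0.2)
route_a = root.add_child(ReactionNode("step 2a", 4.0, 0.5, 0.4))
route_b = root.add_child(ReactionNode("step 2b", 1.0, 0.8, 0.6))
ranked = rank_routes(root, w_yield=0.5, w_duration=0.3, w_difficulty=0.2)
```

Changing the weights reorders the routes. In this toy tree, weighting only yield favors the branch with the higher overall yield, while weighting only difficulty favors the branch with the easier procedures.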