RFL: Simplifying Chemical Structure Recognition with Ring-Free Language

Qikai Chang, Mingjun Chen, Changpeng Pi, Pengfei Hu, Zhenrong Zhang, Jiefeng Ma, Jun Du, Baocai Yin, Jinshui Hu

TL;DR

The paper tackles optical chemical structure recognition (OCSR) for two-dimensional molecules with rings and branches by introducing Ring-Free Language (RFL), which decouples a molecule $G$ into a molecular skeleton $\mathcal{S}$, ring structures $\mathcal{R}$, and branch information $\mathcal{F}$, utilizing SuperAtom/SuperBond abstractions. Building on this, the authors propose a universal Molecular Skeleton Decoder (MSD) that hierarchically predicts $\mathcal{S}$ and $\mathcal{R}$ and then restores the full structure using $\mathcal{F}$ via a Skeleton Generation Module and a Branch Classification Module. The approach yields consistent improvements over state-of-the-art baselines on handwritten (EDU-CHEMC) and printed (Mini-CASIA-CSDB) datasets, with good generalization and only modest computational overhead. This ring-aware, divide-and-conquer framework holds promise for broader structured diagram understanding beyond chemistry and provides a pathway for more robust end-to-end OCSR.

Abstract

The primary objective of Optical Chemical Structure Recognition is to translate chemical structure images into corresponding markup sequences. However, the complex two-dimensional structures of molecules, particularly those with rings and multiple branches, make it difficult for current end-to-end methods to learn one-dimensional markup directly. To overcome this limitation, we propose a novel Ring-Free Language (RFL), which utilizes a divide-and-conquer strategy to describe chemical structures in a hierarchical form. RFL allows complex molecular structures to be decomposed into multiple parts, ensuring both uniqueness and conciseness while enhancing readability. This approach significantly reduces the learning difficulty for recognition models. Leveraging RFL, we propose a universal Molecular Skeleton Decoder (MSD), which comprises a skeleton generation module that progressively predicts the molecular skeleton and individual rings, along with a branch classification module for predicting branch information. Experimental results demonstrate that the proposed RFL and MSD can be applied to various mainstream methods, achieving superior performance compared to state-of-the-art approaches in both printed and handwritten scenarios. The code is available at https://github.com/JingMog/RFL-MSD.
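To make the divide-and-conquer idea concrete, here is a minimal Python sketch of splitting a molecular graph into a skeleton, ring structures, and branch information, and then restoring it. It is an illustration under stated assumptions, not the authors' implementation: the function names `split` and `restore`, the tuple-based node encoding, and the restriction to disjoint (non-fused) rings supplied by the caller are all hypothetical, and the actual RFL markup and SuperBond handling are not reproduced here.

```python
# Toy illustration of the skeleton / ring / branch decomposition.
# Assumptions (not from the paper): rings are given by the caller, rings do
# not share atoms, and bonds carry no order or stereo information.

def split(bonds, rings):
    """Collapse each ring into one 'SuperAtom'-like node.

    Returns (skeleton, ring_bonds, branch):
      skeleton   - bonds between collapsed nodes (ring-free graph)
      ring_bonds - per ring, the bonds that lie inside that ring
      branch     - (ring index, ring atom, outside atom) attachment records
    """
    owner = {atom: k for k, ring in enumerate(rings) for atom in ring}

    def node(atom):
        # A ring atom is represented by its ring; other atoms by themselves.
        return ("ring", owner[atom]) if atom in owner else ("atom", atom)

    skeleton = set()
    ring_bonds = [set() for _ in rings]
    branch = []
    for i, j in sorted(bonds):
        ni, nj = node(i), node(j)
        if ni == nj:
            # Both ends belong to the same ring: this is a ring bond.
            ring_bonds[ni[1]].add(frozenset((i, j)))
            continue
        skeleton.add(frozenset((ni, nj)))
        # Remember which ring atom the outside bond attaches to.
        for a, other in ((i, j), (j, i)):
            if a in owner:
                branch.append((owner[a], a, other))
    return skeleton, ring_bonds, branch


def restore(skeleton, ring_bonds, branch):
    """Rebuild the original bond set from the three parts."""
    bonds = set()
    for rb in ring_bonds:
        bonds |= rb
    for edge in skeleton:
        u, v = tuple(edge)
        if u[0] == "atom" and v[0] == "atom":
            bonds.add(frozenset((u[1], v[1])))
    for _, ring_atom, other in branch:
        bonds.add(frozenset((ring_atom, other)))
    return bonds


# A six-membered ring (atoms 0-5) with a two-atom chain attached at atom 0.
bonds = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6), (6, 7)}
rings = [[0, 1, 2, 3, 4, 5]]

skeleton, ring_bonds, branch = split(bonds, rings)
restored = restore(skeleton, ring_bonds, branch)
```

In this toy example the skeleton contains only two bonds (ring node to atom 6, and atom 6 to atom 7), the six ring bonds are stored separately, and a single branch record notes that the chain attaches at ring atom 0. Restoring from the three parts yields the original bond set.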

Paper Structure

This paper contains 18 sections, 12 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Comparison of Ring-Free Language with previous modeling methods. Previous methods use depth-first search to establish one-dimensional markup, where spatial structures are implicitly expressed. Our RFL significantly simplifies complex spatial structures by decoupling them into molecular skeleton $\mathcal{S}$, ring structures $\mathcal{R}$, and branch information $\mathcal{F}$.
  • Figure 2: The step-by-step decoupling process of Ring-Free Language, including the stages of Splitting and Restoring.
  • Figure 3: The architecture of our method. First, the molecular image is input into the CNN Encoder to extract deep features. Subsequently, the molecular skeleton decoder autoregressively decodes the skeleton $\mathcal{S}$ and rings $\mathcal{R}$. During this process, the node features of skeleton and ring bonds are sent to the Branch Classification Module to derive branch information $\mathcal{F}$, which is used to restore the predicted molecular structure. The common token refers to tokens in RFL that are not specially processed.
  • Figure 4: A qualitative comparison of our proposed method with the SOTA method on the handwritten dataset EDU-CHEMC (a,b,c) and the printed dataset Mini-CASIA-CSDB (d). Compared to RCGD (Hu et al., 2023), our approach robustly recognizes molecular structures in challenging complex ring structures.
  • Figure 5: Illustration of chemical molecules with increased structural complexity. The number in the top left indicates the structural complexity level of each molecule, with the corresponding complexity range.
  • ...and 1 more figure