Multimodal Search in Chemical Documents and Reactions

Ayush Kumar Shah, Abhisek Dey, Leo Luo, Bryan Amador, Patrick Philippy, Ming Zhong, Siru Ouyang, David Mark Friday, David Bianchi, Nick Jackson, Richard Zanibbi, Jiawei Han

TL;DR

This work tackles the fragmented retrieval of chemical knowledge by introducing a multimodal search system that directly links molecular diagrams, text passages, and extracted reaction data. It combines ReactionMiner-based reaction extraction, SMILES generation from text and diagrams, and diagram parsing with RDKit-based structure search, backed by BM25 text retrieval and multimodal ranking fusion. The system provides passage-level access with linked diagrams and reaction contexts, and supports text, structure, and Reaction SMARTS queries, including dedicated reaction navigation within documents. An expert evaluation on Suzuki coupling literature indicates practical utility and identifies room for improvement in metadata enrichment, ranking transparency, and diagram-text linking; future work aims at scaling the system and improving cross-modal representations.
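
The summary names BM25 text retrieval and a multimodal ranking fusion but does not say how the rankings are fused. As a rough illustration only, the sketch below scores tokenized passages with Okapi BM25 and merges the text ranking with a second ranking using reciprocal rank fusion (RRF). RRF is an assumption chosen for simplicity, not necessarily the method the system uses; the passage ids, the toy passages, the parameter values (`k1`, `b`, `k`), and the `structure_ranking` list are all hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        term_freq = Counter(d)
        score = 0.0
        for term in query:
            if term not in term_freq:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = term_freq[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avg_len))
        scores.append(score)
    return scores


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids; an item at rank r contributes 1 / (k + r)."""
    fused = Counter()
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            fused[item] += 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by id for determinism.
    return [item for item, _ in sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))]


# Hypothetical passages, keyed by made-up passage ids.
passages = {
    "p1": "suzuki coupling of aryl bromide with phenylboronic acid".split(),
    "p2": "the product was purified by column chromatography".split(),
    "p3": "palladium catalyzed suzuki coupling gave the biaryl product".split(),
}
ids = list(passages)
text_scores = bm25_scores("suzuki coupling product".split(), [passages[i] for i in ids])
text_ranking = [i for _, i in sorted(zip(text_scores, ids), key=lambda p: (-p[0], p[1]))]

# Hypothetical ranking from a structure search over diagrams linked to passages.
structure_ranking = ["p2", "p3"]

fused = reciprocal_rank_fusion([text_ranking, structure_ranking])
print(text_ranking, fused)
```

RRF uses only rank positions, so the text and structure scores do not need to be on a comparable scale. That is one reason it is a common default when fusing heterogeneous retrievers.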

Abstract

We present a multimodal search tool that facilitates retrieval of chemical reactions, molecular structures, and associated text from scientific literature. Queries may combine molecular diagrams, textual descriptions, and reaction data, allowing users to connect different representations of chemical information. To support this, the indexing process includes chemical diagram extraction and parsing, extraction of reaction data from text in tabular form, and cross-modal linking of diagrams and their mentions in text. We describe the system's architecture, key functionalities, and retrieval process, along with expert assessments of the system. This demo highlights the workflow and technical components of the search system.

Paper Structure

This paper contains 5 sections and 2 figures.

Figures (2)

  • Figure 1: Text/diagram compound extraction and compound-passage linking. Two passage types are shown: (1) reaction passages from ReactionMiner (2 boxes) and (2) a single text passage containing both extracted compounds. Highlighted text denotes extracted chemical entities: pink for mentions in molecular diagrams, yellow for unmatched mentions. Matches come from (1) text matching via Levenshtein distance and (2) SMILES matching via Tanimoto similarity. Highlight colors (e.g., orange and blue) indicate molecules and reaction text linked to the same reaction passage.
  • Figure 2: Multimodal search results for a text and Reaction SMARTS query. Results are organized by document, with matched passages linked to extracted reactions, molecular structures, and highlighted text mentions. Key reaction details, including reactant ('98') and product ('99') in both text and diagrams, along with their predicted SMILES representations, are displayed. Users can navigate directly to relevant sections within each document, with highlighted passages indicating the corresponding matches.
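
The two matching signals named in the Figure 1 caption, Levenshtein distance for text and Tanimoto similarity for SMILES, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the system's implementation. In practice Tanimoto similarity would be computed over molecular fingerprints (for example with RDKit); `toy_fingerprint` below is a hypothetical stand-in that uses character bigrams of the SMILES string only so the example runs without dependencies.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def tanimoto(features_a, features_b):
    """Tanimoto similarity of two feature sets: |A & B| / |A | B|."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def toy_fingerprint(smiles, n=2):
    """Hypothetical stand-in for a molecular fingerprint: SMILES character n-grams."""
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


# Text matching: a name in a passage vs. a slightly different label elsewhere.
name_distance = levenshtein("phenylboronic acid", "phenyl boronic acid")

# SMILES matching: benzene vs. bromobenzene, compared via the toy fingerprint.
smiles_similarity = tanimoto(toy_fingerprint("c1ccccc1"), toy_fingerprint("c1ccccc1Br"))

print(name_distance, smiles_similarity)
```

A linking step could then accept a text-diagram pair when the edit distance falls below some threshold or the Tanimoto similarity rises above one. Any such thresholds are tuning choices that the caption does not specify.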