ChemRecon: a Consolidated Meta-Database Platform for Biochemical Data Integration

Casper Asbjørn Eriksen, Jakob Lykke Andersen, Rolf Fagerberg, Daniel Merkle

TL;DR

ChemRecon enables unified querying, cross-database analysis, and the construction of graph-based representations of sets of related database entries by traversing inter-database connections. This makes it possible to extract information that is unavailable within any single database.

Abstract

In this paper, we present ChemRecon, a meta-database and Python interface for integrating and exploring biochemical data across multiple heterogeneous resources. It consolidates compounds, reactions, enzymes, molecular structures, and atom-to-atom maps from several major databases into a single, consistent ontology. ChemRecon enables unified querying, cross-database analysis, and the construction of graph-based representations of sets of related database entries by traversing inter-database connections. This facilitates information extraction that is impossible within any single database, including deriving consensus information from conflicting sources; identifying the most probable molecular structure associated with a given compound is just one example. The Python interface is available via pip from the Python Package Index (https://pypi.org/project/chemrecon/). ChemRecon is open source, and the source code is hosted on GitLab (https://gitlab.com/casbjorn/chemrecon). Documentation and additional information are available at https://chemrecon.org.

Paper Structure (8 sections, 1 figure, 1 table)

This paper contains 8 sections, 1 figure, 1 table.

Figures (1)

  • Figure 1: An excerpt of the entry graph created by ChemRecon using the example scripts in the workflow section. The BiGG entry 'citrate' is the initial vertex of the graph. The light blue vertices represent compound entries, while the turquoise vertices represent MolStructure entries, annotated with their confidence scores.
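
The idea behind the entry graph in Figure 1 can be illustrated with a minimal, self-contained sketch. It does not use the ChemRecon API. The entry identifiers, cross-references, and structure labels below are all invented placeholders, and the confidence score is assumed here to be a simple vote fraction, which may differ from how ChemRecon actually scores structures.

```python
from collections import Counter, deque

# Hypothetical cross-references between compound entries.
# All identifiers and links are invented for illustration only.
XREFS = {
    "bigg:cit": ["db_a:C1", "db_b:X7"],
    "db_a:C1": ["db_c:42"],
    "db_b:X7": [],
    "db_c:42": [],
    "db_d:unrelated": [],
}

# Hypothetical structure annotation for each entry
# (placeholder labels, not real molecular structures).
STRUCTURES = {
    "bigg:cit": "structure_A",
    "db_a:C1": "structure_A",
    "db_b:X7": "structure_B",
    "db_c:42": "structure_A",
    "db_d:unrelated": "structure_Z",
}


def build_entry_graph(start, xrefs):
    """Collect every entry reachable from `start` by breadth-first traversal."""
    seen = {start}
    queue = deque([start])
    while queue:
        entry = queue.popleft()
        for neighbour in xrefs.get(entry, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def score_structures(entries, structures):
    """Assumed scoring rule: the fraction of entries voting for each structure."""
    votes = Counter(structures[e] for e in entries if e in structures)
    total = sum(votes.values())
    return {structure: count / total for structure, count in votes.items()}


entries = build_entry_graph("bigg:cit", XREFS)
scores = score_structures(entries, STRUCTURES)
best = max(scores, key=scores.get)  # structure with the highest vote fraction
```

In this toy data, the traversal never reaches `db_d:unrelated`, so its structure label does not influence the consensus for the starting entry.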