
optimade-maker: Automated generation of interoperable materials APIs from static data

Kristjan Eimre, Matthew L. Evans, Bud Macaulay, Xing Wang, Jusong Yu, Nicola Marzari, Gian-Marco Rignanese, Giovanni Pizzi

Abstract

Atomistic structural data are central to materials science, condensed matter physics, and chemistry, and are increasingly digitised across diverse repositories and databases. Interoperable access to these heterogeneous data sources enables reusable clients and tools, and is essential for cross-database analyses and data-driven materials discovery. Toward this aim, the OPTIMADE (Open Databases Integration for Materials Design) specification defines a standard REST API for atomistic structures and related properties. However, deploying and maintaining compliant services remains technically demanding and poses a significant barrier for many data providers. Here, we present optimade-maker, a lightweight toolkit for the automated generation of OPTIMADE-compliant APIs directly from raw atomistic structure and property data. The toolkit supports a wide range of raw datasets, enables conversion to a standardised OPTIMADE data representation, and allows rapid deployment of APIs in both local and production environments. We further demonstrate the toolkit through a service on the Materials Cloud Archive that automatically creates and publishes OPTIMADE APIs for contributed datasets, enabling immediate discoverability and interoperability. In addition, we implement data transformation pipelines for the Cambridge Structural Database (CSD) and the Inorganic Crystal Structure Database (ICSD), enabling unified access to these curated resources through the OPTIMADE framework. By lowering the technical barriers to interoperable data publication, optimade-maker represents an important step toward a scalable, FAIR materials data ecosystem integrating both community-contributed and curated databases.

Paper Structure (12 sections, 5 figures)


Figures (5)

  • Figure 1: Schematic illustrating the context of the optimade-maker toolkit within the OPTIMADE ecosystem. Green boxes indicate already established entities. The red box highlights raw materials data that are not readily integrable into the ecosystem, a gap addressed by optimade-maker.
  • Figure 2: Schematic overview of the main components of the optimade-maker CLI. Raw data files are supplemented by the optimade.yaml configuration file, which describes file locations and property definitions. Black boxes show the two primary CLI commands: convert transforms the raw data into the standard JSON Lines format, while serve launches an OPTIMADE API server. The server can be queried with any standard OPTIMADE HTTP request, as shown in the example at the bottom (the gray segment denotes the base URL, and the blue segment is a filter selecting binary structures).
  • Figure 3: Materials Cloud Archive OPTIMADE service. (a) Data pipeline utilizing the optimade-maker toolkit. (b) A webpage for a Materials Cloud Archive entry [wang_machine_2023] containing an optimade.yaml file; the page displays a link to explore the dataset directly with the new Materials Cloud OPTIMADE client.
  • Figure 4: optimade-maker integration with AiiDA. (a) A schematic representing an AiiDA provenance graph. Green rounded rectangles and orange rectangles represent data and process nodes, respectively. Each node contains a description and its AiiDA type (bold). Labels on the arrows represent AiiDA edge labels. (b) The optimade-maker configuration file that allows the AiiDA database to be converted into the OPTIMADE format and served via the API.
  • Figure 5: Screenshot of the new Materials Cloud OPTIMADE Client. An OPTIMADE provider and database (here, a contributed dataset from the Materials Cloud Archive) are selected via dropdown menus. Materials are filtered by composition using an interactive periodic table and by structural properties using sliders. The inset shows the results section: a 3D visualisation of a filtered structure.
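
The Figure 2 caption describes an OPTIMADE HTTP request made up of a base URL and a filter that selects binary structures. A minimal sketch of how such a request URL could be assembled is given below. It assumes the standard OPTIMADE `/v1/structures` endpoint and the standard `nelements` property. The base URL `https://example.org/optimade` is a placeholder, and the helper `build_structures_query` is a hypothetical name used for illustration; it is not part of optimade-maker.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_structures_query(
    base_url: str, filter_expr: str, page_limit: Optional[int] = None
) -> str:
    """Assemble an OPTIMADE structures query URL (illustrative helper only).

    base_url    -- root URL of an OPTIMADE server (placeholder in this sketch)
    filter_expr -- an OPTIMADE filter string, e.g. 'nelements=2'
    page_limit  -- optional value for the standard 'page_limit' query parameter
    """
    params = {"filter": filter_expr}
    if page_limit is not None:
        params["page_limit"] = str(page_limit)
    # Percent-encode the filter so '=', spaces and quotes survive in the URL.
    query = urlencode(params, quote_via=quote)
    return f"{base_url.rstrip('/')}/v1/structures?{query}"


# Placeholder base URL; replace with the address of a real OPTIMADE server.
BASE_URL = "https://example.org/optimade"

# Binary structures: exactly two distinct elements.
binary_url = build_structures_query(BASE_URL, "nelements=2")
print(binary_url)

# A narrower filter combining composition and element count.
sio_url = build_structures_query(
    BASE_URL, 'elements HAS ALL "Si","O" AND nelements=2', page_limit=10
)
print(sio_url)
```

The resulting URL can be passed to any HTTP client. Because the endpoint and filter grammar are defined by the OPTIMADE specification, the same URL pattern should apply to any compliant server, whether launched locally or hosted by a provider.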