M-CODE: Materials Categorization via Ontology, Dimensionality and Evolution

Vsevolod Biryukov, Kamal Choudhary, Timur Bazhirov

TL;DR

M-CODE (Materials Categorization via Ontology, Dimensionality and Evolution) is a compact categorization system that links materials-science terminology to a set of reusable concepts, used as building blocks, and to provenance-aware transformations. It is intended to support reproducible dataset generation, validation, and community contributions.

Abstract

The rapid advancement of artificial intelligence in materials science requires data standards and data management practices that can capture the complexity of real-world structures, including surfaces, interfaces, defects, and dimensionality reduction. We present M-CODE (Materials Categorization via Ontology, Dimensionality and Evolution), a compact categorization system that links materials-science terminology to a set of reusable concepts, used as building blocks, and to provenance-aware transformations. M-CODE classifies structures along three axes: dimensionality; structural complexity (pristine, compound pristine, defective, and processed); and variants that capture common approaches to structure creation and evolution. A practical implementation of the categorization is provided in an open-source codebase that includes JSON schemas, examples, and Python and TypeScript types/interfaces, designed to support reproducible dataset generation, validation, and community contributions.

Paper Structure (23 sections, 1 figure, 5 tables)

Figures (1)

  • Figure 2: Representative target structure classes grouped by domain and annotated with the corresponding M-CODE tags. See Table \ref{tab:categorization-science}. Pristine Structures (top row): MoS$_2$ monolayer, SrTiO$_3$ slab, Si nanowire, and graphene nanoribbon. Compound Pristine Structures (second row): Si/SiO$_2$/HfO$_2$/TiN heterostack, Cu/SiO$_2$ interface (ZSL-matched), twisted MoS$_2$/WS$_2$ interface, and hBN/hBN nanoribbon interface. Defective Structures (third row): Pt adatom on WS$_2$, island on TiN(001), Cu(321)/($\bar{3}\bar{2}\bar{1}$) grain boundary, and nitrogen substitution in graphene. Processed Structures (bottom row): perturbed graphene, passivated Si surface, passivated Si nanowire edge, and Si box cutout.
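
To make the four complexity classes in the caption concrete, the following is a minimal Python sketch that records each example structure together with its class and groups them by row. The names `Complexity`, `Example`, `EXAMPLES`, and `group_by_complexity` are hypothetical and chosen only for illustration; they are not taken from the M-CODE codebase, whose actual schemas and tag values are not shown here. Dimensionality and variant tags are deliberately left out for the same reason.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Complexity(Enum):
    """Structural complexity classes named in the figure caption."""
    PRISTINE = "Pristine"
    COMPOUND_PRISTINE = "Compound Pristine"
    DEFECTIVE = "Defective"
    PROCESSED = "Processed"


@dataclass(frozen=True)
class Example:
    """Hypothetical record: a structure name and its complexity class."""
    name: str
    complexity: Complexity


# Example structures copied from the caption, row by row.
EXAMPLES = [
    Example("MoS2 monolayer", Complexity.PRISTINE),
    Example("SrTiO3 slab", Complexity.PRISTINE),
    Example("Si nanowire", Complexity.PRISTINE),
    Example("graphene nanoribbon", Complexity.PRISTINE),
    Example("Si/SiO2/HfO2/TiN heterostack", Complexity.COMPOUND_PRISTINE),
    Example("Cu/SiO2 interface (ZSL-matched)", Complexity.COMPOUND_PRISTINE),
    Example("twisted MoS2/WS2 interface", Complexity.COMPOUND_PRISTINE),
    Example("hBN/hBN nanoribbon interface", Complexity.COMPOUND_PRISTINE),
    Example("Pt adatom on WS2", Complexity.DEFECTIVE),
    Example("island on TiN(001)", Complexity.DEFECTIVE),
    Example("Cu(321)/(-3-2-1) grain boundary", Complexity.DEFECTIVE),
    Example("nitrogen substitution in graphene", Complexity.DEFECTIVE),
    Example("perturbed graphene", Complexity.PROCESSED),
    Example("passivated Si surface", Complexity.PROCESSED),
    Example("passivated Si nanowire edge", Complexity.PROCESSED),
    Example("Si box cutout", Complexity.PROCESSED),
]


def group_by_complexity(examples):
    """Return a dict mapping each complexity class to its example names."""
    groups = defaultdict(list)
    for ex in examples:
        groups[ex.complexity].append(ex.name)
    return dict(groups)


if __name__ == "__main__":
    for complexity, names in group_by_complexity(EXAMPLES).items():
        print(f"{complexity.value}: {', '.join(names)}")
```

A flat record plus a grouping function keeps the sketch independent of any particular schema; a fuller model would presumably add fields for dimensionality and variant once their allowed values are known.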