MaterialsGalaxy: A Platform Fusing Experimental and Theoretical Data in Condensed Matter Physics
Tiannian Zhu, Zhong Fang, Quansheng Wu, Hongming Weng
TL;DR
MaterialsGalaxy is a structure-similarity-driven platform that bridges experimental and theoretical data in condensed matter physics by converting crystal structures into fingerprints and indexing them for fast vector-based fusion. The system standardizes heterogeneous data, links records through near-real-time similarity searches, and enriches each material profile with both direct and analog information. AI tools for knowledge extraction, structure prediction, and property forecasting augment these profiles. Key contributions include a robust data standardization pipeline, a scalable structure-driven fusion engine, and utility demonstrated on CrGeTe3 and additional materials, supported by a public API and FAIR-aligned data access. By tightly integrating experiment, theory, and AI within a unified platform, this work supports a data-driven approach to materials discovery that accelerates hypothesis generation, synthesis guidance, and cross-modal insight.
Abstract
Modern materials science generates vast and diverse datasets from both experiments and computations, yet these multi-source, heterogeneous data often remain disconnected in isolated "silos". Here, we introduce MaterialsGalaxy, a comprehensive platform that deeply fuses experimental and theoretical data in condensed matter physics. Its core innovation is a structure-similarity-driven data fusion mechanism that quantitatively links cross-modal records (spanning diffraction, crystal growth, computations, and literature) based on their underlying atomic structures. The platform integrates artificial intelligence (AI) tools, including large language models (LLMs) for knowledge extraction, generative models for crystal structure prediction, and machine learning property predictors, to enhance data interpretation and accelerate materials discovery. We demonstrate that MaterialsGalaxy effectively integrates these disparate data sources, uncovering hidden correlations and guiding the design of novel materials. By bridging the long-standing gap between experiment and theory, MaterialsGalaxy provides a new paradigm for data-driven materials research.
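
The structure-similarity-driven linking described above can be illustrated with a minimal sketch. It makes several assumptions: each structure has already been reduced to a fixed-length fingerprint vector, similarity is measured by cosine similarity, and the index is a plain dictionary scanned by brute force. The text does not specify the fingerprint type, the metric, or the index that MaterialsGalaxy uses, so the function names, record IDs, threshold, and toy vectors below are all hypothetical and serve only to show the idea of linking an experimental record to structurally similar theoretical records.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length fingerprint vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero fingerprint carries no structural information
    return dot / (norm_a * norm_b)


def link_records(query_fp, indexed, threshold=0.95, top_k=3):
    """Return up to top_k (record_id, similarity) pairs at or above threshold.

    A brute-force scan stands in for a real vector index here (assumption).
    """
    scored = [(rid, cosine_similarity(query_fp, fp)) for rid, fp in indexed.items()]
    hits = [(rid, s) for rid, s in scored if s >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:top_k]


# Toy, made-up fingerprints for hypothetical theoretical records (not real data).
theory_index = {
    "calc-A": [0.9, 0.1, 0.4, 0.0],
    "calc-B": [0.1, 0.8, 0.0, 0.6],
    "calc-C": [0.88, 0.12, 0.41, 0.02],
}

# Fingerprint of a hypothetical experimental record (e.g. a refined diffraction structure).
experiment_fp = [0.9, 0.1, 0.4, 0.0]

links = link_records(experiment_fp, theory_index)
for record_id, score in links:
    print(record_id, round(score, 3))
```

In this toy setup the experimental record links to the two near-identical theoretical entries and skips the dissimilar one. A production system would presumably replace the linear scan with an approximate nearest-neighbor index to reach near-real-time search over large collections.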
