Energy-GNoME: A Living Database of Selected Materials for Energy Applications
Paolo De Angelis, Giovanni Trezza, Giulio Barletta, Pietro Asinari, Eliodoro Chiavazzo
TL;DR
This work presents Energy-GNoME, an AI-driven, living database framework that mines the large GNoME materials space for energy-related compounds. It combines a specialized energy subset $M^E$, the general-purpose Materials Project (MP) database, and the unexplored GNoME set $G$. A committee of AI-experts (classifiers) defines the energy-material region, and regressor ensembles then predict key properties ($zT$, $E_g$, $\Delta V_c$) for the screened candidates. This enables efficient, bias-aware screening and continuous database growth through active learning. Across three case studies (thermoelectrics, perovskites, and battery cathodes), the protocol yields thousands of promising candidates (7,530 thermoelectrics, 4,259 perovskites, 21,243 cathodes) and shows robust predictive performance ($R^2 \approx 0.7$ for the regressors; AUC $\approx 0.98$ for the AI-experts). The approach addresses extrapolation bias, accelerates materials discovery, and lays the groundwork for extending the Energy-GNoME space to sustainability and toxicity considerations, making it a practical tool for experimental and computational exploration of energy materials.
Abstract
Artificial Intelligence (AI) in materials science is driving significant advancements in the discovery of advanced materials for energy applications. The recent GNoME protocol identified over 380,000 novel stable crystals. Among these, we identify over 33,000 materials with potential as energy materials, which form the Energy-GNoME database. Leveraging Machine Learning (ML) and Deep Learning (DL) tools, our protocol mitigates cross-domain data bias using feature spaces to identify potential candidates for thermoelectric materials, novel battery cathodes, and novel perovskites. Classifiers using both structural and compositional features identify domains of applicability, where we expect enhanced accuracy of the regressors. These regressors are trained to predict key materials properties such as the thermoelectric figure of merit ($zT$), the band gap ($E_g$), and the cathode voltage ($\Delta V_c$). This method significantly narrows the pool of potential candidates, serving as an efficient guide for experimental and computational chemistry investigations and accelerating the discovery of materials suited for electricity generation, energy storage, and energy conversion.
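The two-stage screening described above (classifiers first define a domain of applicability, regressors then predict a property inside it) can be illustrated with a minimal sketch. It is not the authors' implementation. The soft-voting rule, the 0.5 threshold, the function names `committee_probability` and `screen`, and the toy models and feature vectors are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import Callable, Dict, List, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], float]  # returns P(material lies in the energy domain)
Regressor = Callable[[Features], float]   # returns a predicted property, e.g. zT

def committee_probability(experts: List[Classifier], x: Features) -> float:
    """Average the experts' probabilities (soft voting; an assumed combination rule)."""
    return mean(expert(x) for expert in experts)

def screen(candidates: Dict[str, Features],
           experts: List[Classifier],
           regressors: List[Regressor],
           threshold: float = 0.5) -> List[dict]:
    """Keep candidates the committee accepts, then predict with the regressor ensemble."""
    results = []
    for name, x in candidates.items():
        p = committee_probability(experts, x)
        if p < threshold:
            continue  # outside the assumed domain of applicability: skip regression
        preds = [regressor(x) for regressor in regressors]
        results.append({
            "name": name,
            "p_domain": p,
            "mean": mean(preds),     # ensemble prediction
            "spread": stdev(preds),  # disagreement between ensemble members
        })
    return sorted(results, key=lambda r: r["p_domain"], reverse=True)

# Toy stand-ins for trained models (hypothetical, two features per material).
experts = [
    lambda x: 1.0 if x[0] > 0.5 else 0.0,
    lambda x: min(1.0, max(0.0, x[0])),
    lambda x: 0.9 if x[1] < 2.0 else 0.1,
]
regressors = [
    lambda x: 2.0 * x[0],
    lambda x: 2.0 * x[0] + 0.2,
    lambda x: 2.0 * x[0] - 0.2,
]
candidates = {"A": (0.8, 1.0), "B": (0.2, 3.0), "C": (0.6, 3.0)}

results = screen(candidates, experts, regressors)
for row in results:
    print(row)
```

In this toy run, candidate B is rejected by the committee and never reaches the regressors, which mirrors the idea of restricting predictions to the region where the regressors are expected to be more accurate. The ensemble spread gives a rough indication of how much the regressors disagree on each retained candidate.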
