Building Trustworthy AI for Materials Discovery: From Autonomous Laboratories to Z-scores

Benhour Amirian; Ashley S. Dale; Sergei Kalinin; Jason Hattrick-Simpers

TL;DR

The paper proposes the GIFTERS framework to evaluate trustworthiness in AI-driven materials discovery, connecting generalizability, interpretability, fairness, transparency, explainability, robustness, and stability with uncertainty quantification. Through a literature review of 63 studies, it shows most work addresses only a subset of GIFTERS, with generalizability commonly reported but transparency and fairness often lacking. It also analyzes Bayesian and non-Bayesian approaches, highlighting gaps and proposing cross-domain methods from healthcare, climate science, and NLP to improve trust. Finally, it outlines future directions including physics-informed learning, human-in-the-loop governance, and robust evaluation practices to ensure AI accelerates discovery while meeting community norms.

Abstract

Accelerated materials discovery increasingly relies on artificial intelligence and machine learning, collectively termed "AI/ML". A key challenge in using AI is ensuring that human scientists trust that the models are valid and reliable. Accordingly, we define GIFTERS, a trustworthy AI framework for materials science and discovery, to evaluate whether reported machine learning methods are generalizable, interpretable, fair, transparent, explainable, robust, and stable. Through a critical literature review, we highlight that these are the trustworthiness principles most valued by the materials discovery community. However, we also find that comprehensive approaches to trustworthiness are rarely reported; this is quantified by a median GIFTERS score of 5/7. We observe that Bayesian studies frequently omit fair data practices, while non-Bayesian studies most frequently omit interpretability. Finally, we identify approaches for improving trustworthiness methods in AI/ML for materials science by considering work accomplished in other scientific disciplines, such as healthcare, climate science, and natural language processing, with an emphasis on methods that may transfer to materials discovery experiments. By combining these observations, we highlight the necessity of human-in-the-loop and integrated approaches to bridge the gap between trustworthiness and uncertainty quantification in future materials science research. Such approaches ensure that AI/ML methods not only accelerate discovery but also meet the ethical and scientific norms established by the materials discovery community. This work provides a road map for developing trustworthy artificial intelligence systems that will accurately and confidently enable materials discovery.
