From Literature to Lab: Closed-Loop Advancement of Perovskite Solar Cells via Domain Knowledge Guided LLM
Penglei Sun, Shuyan Chen, Xiang Liu, Longhan Zhang, Huajie You, Chang Yan, Yongqi Zhang, Xiaowen Chu, Tong-yi Zhang
TL;DR
This work introduces PVK-LLM, a domain-knowledge–guided LLM tailored for perovskite solar cell research, and couples it with PVK-BO, a Bayesian optimization loop that leverages domain knowledge and experimental feedback to navigate a high-dimensional material-design space. The framework is trained via a three-stage curriculum (PVK-Sci, PVK-Cite, PVK-Exp) and grounded through PVK-KG, enabling retrieval-augmented generation and robust knowledge grounding. Benchmark results show state-of-the-art domain understanding (PVK-MCQ accuracy $87.25\%$) and superior QA performance, while simulator and wet-lab experiments demonstrate accelerated closed-loop optimization, achieving a champion PCE of $26.00\%$ in a four-component passivation system. The work demonstrates that integrating structured domain knowledge with LLMs can rapidly advance real-world PSC development and offers a scalable blueprint for autonomous experimentation in other high-dimensional material systems.
Abstract
Perovskite solar cells (PSCs) are regarded as a disruptive next-generation photovoltaic technology, yet their advancement is constrained by the complexity of perovskite recipes, which span a high-dimensional material and process design space. Despite their impressive general reasoning, Large Language Models (LLMs) face two limitations when applied to PSCs: an inability to align general semantics with perovskite domain knowledge, and inefficiency in navigating high-dimensional perovskite material and recipe design spaces. To address these limitations, we introduce PVK-LLM, a domain-knowledge-guided specialized model that serves as an expert bridging general semantics and perovskite domain knowledge. By integrating this domain knowledge into a hierarchical Bayesian optimization workflow, our approach efficiently navigates the high-dimensional design space on a solar cell simulator platform. The domain knowledge resolves the cold-start problem while adapting dynamically to simulator feedback. Moreover, in a separate wet-lab experiment aimed at maximizing power conversion efficiency (PCE), our framework autonomously proposes a novel synergistic four-component organic passivation recipe (3MTPAI, PDAI2, EDAI2, and PipDI) that has not been reported in the existing literature. This AI-designed recipe achieves a champion PCE of over 26.0%, approaching world records achieved through extensive expert trial and error. Our approach enables LLMs to comprehend domain knowledge and to navigate high-dimensional design spaces efficiently, and it can accelerate progress in real-world perovskite development as well as in other areas of materials science.
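To make the cold-start idea concrete, the following is a minimal sketch of a knowledge-seeded Bayesian optimization loop. It is not the authors' implementation: it uses a flat (not hierarchical) Gaussian-process surrogate with an upper-confidence-bound acquisition, and every name in it (`toy_score`, `seeds`, `grid`, `bayes_opt`) is a hypothetical stand-in. In particular, `seeds` plays the role of recipes suggested by a domain expert model, and `toy_score` replaces the simulator or wet-lab measurement with an arbitrary synthetic function.

```python
import math


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two points."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def gp_posterior(X, y, x_new, noise=1e-4):
    """Gaussian-process posterior mean and std at x_new given (X, y)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k = [rbf(a, x_new) for a in X]
    mean_y = sum(y) / len(y)  # constant prior mean: the observed average
    alpha = solve(K, [v - mean_y for v in y])
    v = solve(K, k)
    mean = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
    return mean, math.sqrt(var)


def bayes_opt(objective, candidates, seeds, n_iter=10, beta=2.0):
    """Evaluate the seeds first, then pick points by upper confidence bound."""
    X = list(seeds)
    y = [objective(x) for x in X]
    for _ in range(n_iter):
        pool = [c for c in candidates if c not in X]
        if not pool:
            break

        def ucb(c):
            m, s = gp_posterior(X, y, c)
            return m + beta * s

        nxt = max(pool, key=ucb)
        X.append(nxt)
        y.append(objective(nxt))
    best = max(range(len(y)), key=y.__getitem__)
    return X[best], y[best], X


def toy_score(x):
    """Synthetic objective (assumption): a single peak at (0.3, 0.7)."""
    return 1.0 - 4.0 * ((x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2)


# Two normalized recipe variables on an 11 x 11 grid (illustrative only).
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
# Hypothetical knowledge-derived starting points replacing random initialization.
seeds = [(0.2, 0.6), (0.4, 0.8)]

best_x, best_y, history = bayes_opt(toy_score, grid, seeds, n_iter=10)
print(best_x, best_y)
```

The design point the sketch illustrates is only the seeding step: the surrogate is fitted to knowledge-derived candidates before any acquisition is computed, so the first proposals are informed rather than random. Swapping `seeds` for randomly drawn grid points gives a simple baseline for comparison.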
