Quotient complex (QC)-based machine learning for 2D perovskite design
Chuan-Shen Hu, Rishikanta Mayengbam, Kelin Xia, Tze Chien Sum
TL;DR
The paper addresses the challenge of representing 2D perovskites in a way that captures both higher-order interactions and periodicity. It introduces a quotient-complex (QC) framework that defines $\overline{K} = K/\sim_V$ and constructs a homotopy-equivalent model $\widetilde{K}$ so that $H_q(\overline{K}) \cong H_q(\widetilde{K})$ for $q>1$, with corresponding relations for $q=0,1$. Persistent homology is then applied to QC filtrations to yield QC-based descriptors (QCDs) from the persistence barcodes ${\rm PB}_q(\overline{K_\bullet})$, including dimension-0 and dimension-1 components and element-specific variants. A gradient-boosted tree model (QC-GBT) trained on these QCDs and tested on the NMSE 2D perovskite dataset predicts bandgaps with accuracy competitive with or better than state-of-the-art SOAP and GNN approaches, which points to the importance of periodicity information for predicting material functionality. The data and code are openly available, supporting accelerated design of 2D perovskites and, more broadly, of periodic materials.
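To make the role of the vertex identification $\sim_V$ concrete, the following is a minimal sketch of a dimension-0 barcode computed on a quotient of a distance-filtered graph. It is not the authors' implementation. The function name `quotient_h0_barcode`, the `periodic` flag, and the restriction to the 3^d neighbouring cells are all assumptions made for illustration. Every periodic image of an atom is treated as the same vertex, so an edge to an image of atom j becomes an edge to atom j itself, and components are merged with a union-find structure as the distance threshold grows.

```python
from itertools import product
import math


def quotient_h0_barcode(points, lattice, periodic=True):
    """Dimension-0 bars (birth, death) of a distance filtration on atoms.

    points   : Cartesian coordinates of the atoms in one unit cell
    lattice  : lattice vectors (one per periodic direction)
    periodic : if True, periodic images are identified with their
               originals (illustrative stand-in for K / ~_V)
    Assumption: only the 3^d neighbouring cells are considered.
    """
    n = len(points)
    dim = len(lattice)
    shifts = product((-1, 0, 1), repeat=dim) if periodic else [(0,) * dim]

    # Edge (length, i, j): atom i in the central cell to an image of atom j.
    # In the quotient the image is the same vertex as atom j.
    edges = []
    for shift in shifts:
        offset = [
            sum(s * vec[k] for s, vec in zip(shift, lattice))
            for k in range(len(points[0]))
        ]
        for i in range(n):
            for j in range(i + 1, n):  # i == j would be a self-loop: no effect on H_0
                image = [qk + ok for qk, ok in zip(points[j], offset)]
                edges.append((math.dist(points[i], image), i, j))
    edges.sort()

    # Union-find: each merge of two components ends one dimension-0 bar.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            bars.append((0.0, length))
    bars.append((0.0, math.inf))  # the component that never dies
    return bars


cell = [(1.0, 0.0), (0.0, 1.0)]          # hypothetical square unit cell
atoms = [(0.1, 0.5), (0.9, 0.5)]         # two atoms near opposite cell faces
with_pbc = quotient_h0_barcode(atoms, cell)
without_pbc = quotient_h0_barcode(atoms, cell, periodic=False)
```

In this toy cell the two atoms are 0.8 apart inside the cell but only about 0.2 apart across the periodic boundary, so the finite bar dies much earlier once images are identified. That difference is the kind of periodicity information a unit-cell-only complex would miss.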
Abstract
With remarkable stability and exceptional optoelectronic properties, two-dimensional (2D) halide layered perovskites hold immense promise for revolutionizing photovoltaic technology. At present, inadequate representations substantially impede the design and discovery of 2D perovskites. In this context, we introduce a novel computational topology framework termed the quotient complex (QC), which serves as the foundation for the material representation. Our QC-based features are integrated with learning models to advance 2D perovskite design. At the heart of this framework lie the quotient complex descriptors (QCDs), which represent a quotient variation of simplicial complexes derived from the material's unit cell and periodic boundary conditions. Unlike prior material representations, this approach encodes higher-order interactions and periodicity information simultaneously. On the well-established New Materials for Solar Energetics (NMSE) databank, our QC-based machine learning models exhibit superior performance against all existing counterparts. This underscores the paramount role of periodicity information in predicting material functionality, while also showcasing the efficiency of the QC-based model in characterizing materials' structural attributes.
