Solar-GECO: Perovskite Solar Cell Property Prediction with Geometric-Aware Co-Attention

Lucas Li, Jean-Baptiste Puel, Florence Carton, Dounya Barrit, Jhony H. Giraldo

TL;DR

Solar-GECO tackles the multiscale design challenge of perovskite solar cells by integrating atomic-structure information with device-layer context through a geometric GNN and a language-model-based encoder, fused via multi-layer co-attention. The approach yields accurate PCE predictions and well-calibrated uncertainty, surpassing state-of-the-art baselines that rely only on text or composition. By explicitly modeling intra-layer properties and inter-layer interactions, Solar-GECO enables more reliable device-level screening in a vast design space. The work paves the way for accelerated discovery of high-performance perovskite devices by bridging crystal geometry with device architecture.

Abstract

Perovskite solar cells are promising candidates for next-generation photovoltaics. However, their performance as multi-scale devices is determined by complex interactions between their constituent layers. This creates a vast combinatorial space of possible materials and device architectures, making conventional experiment-based screening slow and expensive. Existing machine learning models address this problem only in part: they either focus on individual material properties or neglect the geometric information of the perovskite crystal. To close this gap, we propose Solar-GECO, a geometric-aware co-attention model for predicting perovskite solar cell power conversion efficiency (PCE). Solar-GECO combines a geometric graph neural network (GNN), which directly encodes the atomic structure of the perovskite absorber, with language model embeddings of the textual strings that represent the chemical compounds of the transport layers and other device components. A co-attention module captures intra-layer dependencies and inter-layer interactions, while a probabilistic regression head predicts both PCE and its associated uncertainty. Solar-GECO achieves state-of-the-art performance, outperforming several baselines and reducing the mean absolute error (MAE) for PCE prediction from 3.066 to 2.936 relative to semantic GNN (the previous state-of-the-art model). These results indicate that integrating geometric and textual information yields a more accurate framework for PCE prediction.
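To make the probabilistic regression head more concrete, the following is a minimal NumPy sketch of a per-sample Gaussian negative log-likelihood, the kind of loss such a head is typically trained with. It is an illustration under stated assumptions, not the authors' implementation: the function name `gaussian_nll` and the choice to parameterize the variance through its logarithm are hypothetical.

```python
import numpy as np


def gaussian_nll(y_true, mean, log_var):
    """Per-sample Gaussian negative log-likelihood.

    Assumption (not from the paper): the head outputs log-variance,
    so that exp(log_var) is always a valid positive variance.
    """
    y_true = np.asarray(y_true, dtype=float)
    mean = np.asarray(mean, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    var = np.exp(log_var)
    # 0.5 * [log(2*pi) + log(sigma^2) + (y - mu)^2 / sigma^2]
    return 0.5 * (np.log(2.0 * np.pi) + log_var + (y_true - mean) ** 2 / var)


# A large error is penalized less when the model also reports high variance.
confident_but_wrong = gaussian_nll(3.0, 0.0, np.log(1.0))
uncertain_and_wrong = gaussian_nll(3.0, 0.0, np.log(9.0))
```

The two example values show the trade-off this loss encodes: for the same error of 3 PCE points, reporting a larger variance lowers the loss, while the `log_var` term discourages inflating the variance on samples that are predicted well.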

Paper Structure

This paper contains 17 sections, 14 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Our model integrates the atomic structure of the perovskite absorber (left) with the device context of stacked layers (center) through a co-attention module. The fused representation is used to predict the power conversion efficiency (PCE) of the device and its associated uncertainty (right).
  • Figure 2: Overview of Solar-GECO. Our model combines a geometric GNN encoder for the perovskite absorber’s 3D atomic structure with a device text encoder (LLM) for chemical descriptions of the substrate, ETL, HTL, and back contact. Their outputs are fused in a co-attention module with self- and cross-attention layers to model intra-layer dependencies and inter-layer interactions. Pooled features are concatenated and passed to a probabilistic regression head that predicts the mean and variance of PCE, trained with a Gaussian negative log-likelihood loss.
  • Figure 2: Training hyperparameters for the Solar-GECO model.
  • Figure 3: Plot of predicted PCE versus true PCE for the original model on the test set. The diagonal line represents perfect prediction.
  • Figure 4: Calibration plot: quantile-binned mean $\overline{|e|}$ vs. $\overline{\sigma}$ with $95\%$ confidence intervals (CIs) for the mean, overlaid with the theoretical line $c\,\sigma$, $c=\sqrt{2/\pi}$.
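
The theoretical line in the calibration plot follows from a standard fact: if the error $e$ is zero-mean Gaussian with standard deviation $\sigma$, then $\mathbb{E}|e| = \sigma\sqrt{2/\pi}$. The sketch below illustrates this kind of quantile-binned check on synthetic data. It is not the authors' evaluation code; the function name `binned_calibration`, the bin count, and the synthetic error model are assumptions made for illustration, and the confidence intervals shown in the figure are omitted.

```python
import numpy as np

# Ratio E|e| / sigma for a zero-mean Gaussian error.
C = np.sqrt(2.0 / np.pi)


def binned_calibration(abs_err, sigma, n_bins=10):
    """Sort samples by predicted sigma, split them into equal-count bins,
    and return the mean sigma and mean absolute error of each bin."""
    abs_err = np.asarray(abs_err, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    order = np.argsort(sigma)
    bins = np.array_split(order, n_bins)
    mean_sigma = np.array([sigma[b].mean() for b in bins])
    mean_abs_err = np.array([abs_err[b].mean() for b in bins])
    return mean_sigma, mean_abs_err


# Synthetic, perfectly calibrated predictions (assumed for illustration only):
# each error is drawn from a Gaussian whose scale equals the predicted sigma.
rng = np.random.default_rng(0)
sigma = rng.uniform(0.5, 5.0, size=200_000)
err = rng.normal(0.0, sigma)

mean_sigma, mean_abs_err = binned_calibration(np.abs(err), sigma)
ratio = mean_abs_err / mean_sigma  # close to C in every bin if calibrated
```

For a calibrated model the per-bin ratio stays near $c \approx 0.798$. Bins that fall above the line $c\,\sigma$ indicate overconfidence (errors larger than the predicted uncertainty implies), and bins below it indicate underconfidence.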