Advancing Extrapolative Predictions of Material Properties through Learning to Learn

Kohei Noda; Araki Wakiuchi; Yoshihiro Hayashi; Ryo Yoshida

TL;DR

A meta-learning approach enhances the extrapolative generalization capabilities of neural networks, as demonstrated in predicting the properties of polymeric materials and hybrid organic-inorganic perovskites.

Abstract

Recent advances in machine learning have shown its potential to significantly accelerate the discovery of new materials. Central to this progress is the development of rapidly computable property predictors, which enable the identification of novel materials with desired properties from vast material spaces. However, the limited availability of data poses a significant challenge in data-driven materials research, particularly hindering the exploration of innovative materials beyond the boundaries of existing data. Because machine learning predictors are inherently interpolative, establishing a general methodology for building an extrapolative predictor remains a fundamental challenge. In this study, we leverage an attention-based neural network architecture and meta-learning algorithms to acquire extrapolative generalization capability. The meta-learners, trained repeatedly on arbitrarily generated extrapolative tasks, can acquire outstanding generalization capability in unexplored material spaces. Through the tasks of predicting the physical properties of polymeric materials and hybrid organic-inorganic perovskites, we highlight the potential of such extrapolatively trained models, particularly their ability to rapidly adapt to unseen material domains in transfer learning scenarios.

Paper Structure (26 sections, 5 equations, 7 figures, 1 table)

Figures (7)

Figure 1: Extrapolative episodic training (E2T) with MNNs involves generating numerous episodes from a given dataset, each comprising a support set ($\mathcal{S}$) and an input-output pair $(x,y)$. By including in the episode set many pairs of $\mathcal{S}$ and $(x,y)$ that stand in an extrapolative relationship, the trained MNN learns a general mapping $y = f(x, \mathcal{S})$ for predicting $y$ from $x$ extrapolatively with any given $\mathcal{S}$.
Figure 2: Scaling behavior of the out-of-domain generalization performance (RMSE: root mean squared error) on the specific heat ($C_p$) prediction task as the number of training samples increases. RMSEs of MNNs trained with E2T and of conventional FCNNs are shown in blue and orange, respectively. The red dashed lines denote the generalization performance of conventional domain-inclusive learning using data from all polymer classes.
Figure 3: Scaling behavior of the out-of-domain generalization performance (RMSE) on the refractive index prediction task as the number of training samples increases. RMSEs of MNNs trained with E2T and of conventional FCNNs are shown in blue and orange, respectively. The red dashed lines denote the generalization performance of conventional domain-inclusive learning using data from all polymer classes.
Figure 4: Sensitivity analysis of E2T in the two extrapolative prediction tasks (HOIP-GeF and HOIP-PbI) using the HOIP dataset. (a) Variation of the RMSE with the training support size, with the inference support size fixed at 1,248 (left panel) and 1,228 (right panel). (b) Variation of the RMSE with the inference support size at $\lambda = 100$. In panel (a), the colored lines indicate different smoothing parameters $\lambda$. In panel (b), the colored lines represent different training support sizes. The shaded areas indicate the standard deviations.
Figure 5: Scaling behavior of the fine-tuned $C_p$ predictor with an increasing number of target samples. The results of E2T are depicted in blue, while those of FCNN are shown in orange. Each panel represents a different polymer class. The x-axis indicates the number of samples from the target domain used for fine-tuning, while the y-axis represents the RMSE with its standard deviation. The red dashed line denotes the generalization performance of the model trained on the entirely sampled dataset, including the target domain.
...and 2 more figures
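
The episode construction described in the Figure 1 caption can be illustrated with a minimal sketch. It is not the authors' implementation. The splitting rule (support set drawn below a random cutoff in the sorted property values, query drawn above it), the function names `sample_extrapolative_episode` and `predict_from_support`, the fixed RBF kernel, and the synthetic data are all assumptions made for illustration. In E2T the mapping $y = f(x, \mathcal{S})$ would be a trained attention-based network; here a kernel ridge predictor with a smoothing parameter `lam` stands in for it, only to show how a prediction can depend on both $x$ and $\mathcal{S}$.

```python
import numpy as np


def sample_extrapolative_episode(X, y, support_size, rng):
    """Draw one episode (S, (x, y)) whose query target lies outside the support range.

    Hypothetical rule: sort samples by target value, pick a random cutoff,
    draw the support set below the cutoff and the query at or above it.
    Requires len(y) > support_size.
    """
    n = len(y)
    order = np.argsort(y)
    # cut is in [support_size, n - 1]: enough samples below, at least one above.
    cut = rng.integers(support_size, n)
    support_idx = rng.choice(order[:cut], size=support_size, replace=False)
    query_idx = rng.choice(order[cut:])
    return (X[support_idx], y[support_idx]), (X[query_idx], y[query_idx])


def predict_from_support(x, support, lam=1.0, gamma=1.0):
    """Stand-in for y = f(x, S): kernel ridge regression on the support set.

    The RBF kernel is fixed here (an assumption); a meta-learner would
    instead learn the embedding over many episodes.
    """
    Xs, ys = support

    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    K = rbf(Xs, Xs)                # (m, m) similarities within the support set
    k = rbf(x[None, :], Xs)[0]     # (m,) similarities between query and support
    alpha = np.linalg.solve(K + lam * np.eye(len(ys)), ys)
    return float(k @ alpha)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

episodes = [sample_extrapolative_episode(X, y, 20, rng) for _ in range(100)]
(support, (x_q, y_q)) = episodes[0]
y_hat = predict_from_support(x_q, support, lam=1.0)
```

Each episode pairs a support set with a query whose target is at least as large as every support target, so a model fitted across such episodes is repeatedly asked to predict beyond the range it is conditioned on. Other ways of defining an extrapolative relationship, such as splitting by material class, would fit the same episode interface.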