Foundation Model for Polycrystalline Material Informatics

Ting-Ju Wei, Chuin-Shan Chen

TL;DR

This work develops a 3D polycrystal foundation model pretrained with a self-supervised masked autoencoder on a large dataset of FCC representative volume elements (RVEs) whose textures cover the texture space. The latent representations, captured via a quaternion-valued, patch-based encoder, transfer effectively to two downstream tasks: predicting homogenized stiffness and inferring ODMN parameters for nonlinear, crystal-plasticity-based homogenization. Across experiments, the pretrained encoder consistently outperforms non-pretrained baselines, a 40% masking ratio gives the best generalization, and the integrated ODMN accurately reproduces nonlinear responses for unseen microstructures. The approach transfers well in data-scarce regimes and offers a pathway to incorporating experimental microstructures for texture-informed microstructure-property reasoning in materials design.

Abstract

We present a three-dimensional polycrystal foundation model based on a masked autoencoder that learns intrinsic microstructural representations through large-scale self-supervised pretraining on voxel-based data. The pretraining dataset consists of 100,000 face-centered cubic (FCC) microstructures whose crystallographic textures span the texture hull via hierarchical simplex sampling. The quality and transferability of the learned representations are evaluated through two downstream tasks: (i) homogenized stiffness prediction and (ii) nonlinear homogenized response prediction. In the latter, the pretrained encoder is coupled with an orientation-aware interaction-based deep material network (ODMN), where the learned latent representations are used to infer microstructure-dependent ODMN parameters. This enables accurate stress-strain predictions for previously unseen microstructures under crystal plasticity. Across both tasks, the pretrained encoder consistently exhibits superior generalization performance compared to non-pretrained baselines. These results demonstrate the strong transferability of the proposed foundation model and its effectiveness in data-scarce scientific settings with limited labeled microstructures. The framework further enables scalable integration with experimentally derived microstructures, providing a practical basis for microstructure-property reasoning in materials design.
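
To make the masked-autoencoder input concrete, the following is a minimal sketch of how a voxel-based RVE might be cut into patches and randomly masked before encoding. It is an illustration under stated assumptions, not the authors' implementation: the grid size (16×16×16), the patch size (4), the per-voxel unit-quaternion encoding, and the helper names `patchify` and `random_mask` are all hypothetical. Only the 40% masking ratio comes from the summary above.

```python
import numpy as np


def patchify(rve: np.ndarray, patch: int) -> np.ndarray:
    """Split a (D, H, W, 4) quaternion voxel grid into non-overlapping cubes.

    Returns an array of shape (num_patches, patch**3 * 4), one row per patch.
    """
    d, h, w, c = rve.shape
    assert d % patch == 0 and h % patch == 0 and w % patch == 0
    # Separate each spatial axis into (block index, offset inside block).
    x = rve.reshape(d // patch, patch, h // patch, patch, w // patch, patch, c)
    # Bring the three block indices to the front, offsets and channels after.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, patch ** 3 * c)


def random_mask(num_patches: int, mask_ratio: float,
                rng: np.random.Generator) -> np.ndarray:
    """Return a boolean vector where True marks a masked (hidden) patch."""
    num_masked = int(round(mask_ratio * num_patches))
    perm = rng.permutation(num_patches)
    mask = np.zeros(num_patches, dtype=bool)
    mask[perm[:num_masked]] = True
    return mask


rng = np.random.default_rng(0)

# Synthetic stand-in for an RVE: one random unit quaternion per voxel.
# (Assumed layout; a real RVE would hold grain-wise constant orientations.)
rve = rng.normal(size=(16, 16, 16, 4))
rve /= np.linalg.norm(rve, axis=-1, keepdims=True)

patches = patchify(rve, patch=4)                      # shape (64, 256)
mask = random_mask(len(patches), mask_ratio=0.4, rng=rng)
visible = patches[~mask]                              # only these reach the encoder
```

In a masked autoencoder, only the `visible` patches are passed to the encoder, and a decoder is trained to reconstruct the masked ones; the reconstruction loss and the network itself are omitted here.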

Paper Structure

This paper contains 17 sections, 22 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Schematic illustration of the proposed polycrystal foundation model during the self-supervised pretraining stage. The colored voxel grids within the RVE symbolically represent different crystallographic orientations and are shown for illustrative purposes only.
  • Figure 2: Schematic illustration of the downstream workflow for homogenized stiffness prediction. The pretrained encoder extracts texture-aware latent representations, which are subsequently mapped to homogenized stiffness components through a linear regression head.
  • Figure 3: Schematic illustration of the downstream workflow for nonlinear homogenized response prediction. (a) During the downstream offline training stage, the pretrained encoder extracts texture-aware latent features, which are subsequently passed through a linear regression head to predict the ODMN parameters. (b) During the online prediction stage, the inferred ODMN is coupled with the constituent phase material behavior to enable nonlinear homogenized response prediction.
  • Figure 4: Pretraining loss curves for different masking ratios ranging from 20% to 90%.
  • Figure 5: UMAP projection of the latent representations extracted from the pretrained encoder (masking ratio 40%). Each gray point corresponds to a single RVE from the pretraining dataset.
  • ...and 5 more figures