Foundation Model for Composite Microstructures: Reconstruction, Stiffness, and Nonlinear Behavior Prediction

Ting-Ju Wei, Chuin-Shan Chen

TL;DR

The paper addresses the challenge of predicting mechanical properties from microstructure when labeled data are scarce. It introduces a microstructure foundation model, the Material Masked Autoencoder (MMAE), pretrained in a self-supervised manner on 2D short-fiber representative volume elements (RVEs). Two downstream paths are demonstrated: (i) transfer learning from MMAE embeddings to predict homogenized stiffness components via linear probing or fine-tuning, and (ii) coupling the MMAE with an Interaction-based Material Network (IMN) so that IMN parameters are inferred directly from an image, enabling online extrapolation of nonlinear stress–strain responses for unseen microstructures. Key contributions include the first microstructure-oriented foundation model, evidence of data-efficient stiffness prediction (up to $R^2 \approx 0.96$), accurate nonlinear extrapolation (mean relative errors of a few percent), and a framework that maps microstructure images directly to physically interpretable IMN parameters. The work lays the groundwork for extensions to 3D composites and experimental data, with the aim of robust, geometry-aware surrogate models for materials design and analysis.

Abstract

We present the Material Masked Autoencoder (MMAE), a self-supervised Vision Transformer pretrained on a large corpus of short-fiber composite images via masked image reconstruction. The pretrained MMAE learns latent representations that capture essential microstructural features and are broadly transferable across tasks. We demonstrate two key applications: (i) predicting homogenized stiffness components through fine-tuning on limited data, and (ii) inferring physically interpretable parameters by coupling MMAE with an interaction-based material network (IMN), thereby enabling extrapolation of nonlinear stress-strain responses. These results highlight the promise of microstructure foundation models and lay the groundwork for future extensions to more complex systems, such as 3D composites and experimental datasets.
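
To make the idea of masked image reconstruction concrete, the following is a minimal sketch of the masking step only, not the authors' implementation. The 8x8 toy image, the 2x2 patch size, uniform random sampling of patches, and the helper names `patchify` and `random_mask` are all assumptions for illustration; the 75% masking ratio is one of the two ratios reported in the reconstruction figure.

```python
import random


def patchify(image, patch):
    """Split a square 2D image (list of rows) into non-overlapping patch x patch tiles."""
    size = len(image)
    if size % patch != 0:
        raise ValueError("image size must be divisible by patch size")
    tiles = []
    for r in range(0, size, patch):
        for c in range(0, size, patch):
            tiles.append([row[c:c + patch] for row in image[r:r + patch]])
    return tiles


def random_mask(num_patches, mask_ratio, seed=None):
    """Return (visible_idx, masked_idx) for a uniformly random patch mask (assumed scheme)."""
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# Hypothetical toy "microstructure": 8x8 binary image, 2x2 patches -> 16 patches.
image = [[(r // 2 + c // 2) % 2 for c in range(8)] for r in range(8)]
tiles = patchify(image, patch=2)

# Only the visible patches would be passed to the encoder; the decoder's job
# is to reconstruct the masked ones.
visible, masked = random_mask(len(tiles), mask_ratio=0.75, seed=0)
encoder_input = [tiles[i] for i in visible]
```

With a 75% ratio the encoder sees only a quarter of the patches, which is what makes the pretraining task non-trivial: the missing fibers have to be inferred from their surroundings.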

Paper Structure

This paper contains 30 sections, 11 equations, 15 figures, 3 tables, and 1 algorithm.

Figures (15)

  • Figure 1: Self-supervised pre-training architecture of the MMAE. The encoder extracts latent features from visible microstructure patches, while the decoder reconstructs the original microstructure image using these extracted embeddings and positional encodings. Light blue regions indicate trainable components; grey blocks are frozen modules; green arrows represent data flow.
  • Figure 2: Transfer learning strategies utilizing MMAE embeddings: (a) Linear probing with frozen encoder; (b) Fine-tuning with trainable encoder and prediction head. Grey blocks indicate frozen parameters; light blue blocks indicate trainable parameters; green arrows depict data flow.
  • Figure 3: End-to-end transfer learning framework from MMAE to IMN. Each input microstructure image $\mathbf{I}$ is encoded by the pre-trained MMAE, and its [CLS] token serves as a latent representation. A linear projection head maps this embedding to predict the IMN parameter set $\mathcal{F}_{\text{IMN}}(\mathbf{I})$. The IMN then uses these parameters, along with the stiffness matrices of phase 1 and phase 2 ($\mathbf{C}^{p1}$, $\mathbf{C}^{p2}$), to compute the predicted homogenized stiffness tensor $\bar{\mathbf{C}}$. Grey blocks indicate frozen (non-trainable) modules, light blue blocks denote trainable components, and green arrows represent the data flow throughout the pipeline.
  • Figure 4: Schematic of online nonlinear prediction using the inferred IMN. (a) A previously unseen microstructure image is encoded by the MMAE and projected to a set of IMN parameters. (b) The inferred IMN, when combined with phase-specific constitutive models (e.g., elasticity or plasticity), enables nonlinear response prediction.
  • Figure 5: Microstructure reconstruction results using the pretrained MMAE under two masking ratios. Each row ((a)–(d)) corresponds to a randomly selected microstructure from the pretraining dataset, differing in key morphological descriptors such as aspect ratio, volume fraction, and number of inclusions. From left to right: original image, masked input, and reconstruction under 75% masking; followed by masked input and reconstruction under 85% masking.
  • ...and 10 more figures
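
The linear-probing strategy in the Figure 2 caption (frozen encoder, trainable prediction head) can be sketched as below. This is an illustration under stated assumptions, not the paper's code: the "encoder" is a fixed random projection standing in for the pretrained MMAE, the data are synthetic, and the names `encode`, `linear_probe`, `IN_DIM`, and `EMB_DIM` are hypothetical.

```python
import math
import random

rng = random.Random(0)
IN_DIM, EMB_DIM = 6, 4

# Stand-in for a pretrained encoder (assumption): a fixed random projection with tanh.
ENCODER_W = [[rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(EMB_DIM)]


def encode(x):
    """Frozen encoder: its weights are never updated during probing."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in ENCODER_W]


def predict(head_w, head_b, z):
    """Linear prediction head applied to an embedding z."""
    return sum(w * zi for w, zi in zip(head_w, z)) + head_b


def mse(head_w, head_b, feats, targets):
    """Mean squared error of the head over a dataset of embeddings."""
    return sum((predict(head_w, head_b, z) - y) ** 2
               for z, y in zip(feats, targets)) / len(targets)


def linear_probe(feats, targets, lr=0.1, epochs=200):
    """Train only the linear head with full-batch gradient descent on MSE."""
    head_w, head_b = [0.0] * EMB_DIM, 0.0
    n = len(targets)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * EMB_DIM, 0.0
        for z, y in zip(feats, targets):
            err = predict(head_w, head_b, z) - y
            for j in range(EMB_DIM):
                grad_w[j] += 2 * err * z[j] / n
            grad_b += 2 * err / n
        head_w = [w - lr * g for w, g in zip(head_w, grad_w)]
        head_b -= lr * grad_b
    return head_w, head_b


# Synthetic inputs and a synthetic scalar target (a placeholder for one stiffness component).
inputs = [[rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(40)]
feats = [encode(x) for x in inputs]  # embeddings are computed once and reused
true_w = [0.5, -1.0, 0.25, 2.0]
targets = [sum(w * zi for w, zi in zip(true_w, z)) + 0.3 for z in feats]

loss_before = mse([0.0] * EMB_DIM, 0.0, feats, targets)
head_w, head_b = linear_probe(feats, targets)
loss_after = mse(head_w, head_b, feats, targets)
```

Because the encoder is frozen, the embeddings can be computed once and cached, so only `EMB_DIM + 1` parameters are fitted. Fine-tuning, by contrast, would also update the encoder weights, at a higher computational cost.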