Neural networks meet anisotropic hyperelasticity: A framework based on generalized structure tensors and isotropic tensor functions

Karl A. Kalina; Jörg Brummund; WaiChing Sun; Markus Kästner

TL;DR

This work presents a physics-augmented neural network (PANN) framework for anisotropic finite-strain hyperelasticity that uses generalized structure tensors (GSTs) up to order 6 to encode material symmetry. By training parameterized GSTs together with a neural network energy surrogate and adding trainable gates for sparsity, the method automatically detects anisotropy classes and orientations from multiscale homogenization data. The approach achieves excellent interpolation and markedly better extrapolation than coordinate-based NN models, while enforcing key physical constraints by construction. With data generated from five representative volume elements (RVEs), the method demonstrates robust anisotropy identification and yields compact, accurate macroscopic surrogates suitable for multiscale FE implementations and for the assimilation of real experimental data.

Abstract

We present a data-driven framework for the multiscale modeling of anisotropic finite strain elasticity based on physics-augmented neural networks (PANNs). Our approach allows the efficient simulation of materials with complex underlying microstructures that exhibit an overall anisotropic and nonlinear behavior on the macroscale. By using a set of invariants as input, an energy-type output, and several correction terms added to the overall energy density functional, the model fulfills multiple physical principles by construction. The invariants are formed from the right Cauchy-Green deformation tensor and fully symmetric 2nd, 4th or 6th order structure tensors, which makes it possible to describe a wide range of symmetry groups. Besides the network parameters, the structure tensors are calibrated simultaneously during training so that the underlying anisotropy of the material is reproduced as accurately as possible. In addition, sparsity of the model with respect to the number of invariants is enforced by adding a trainable gate layer and using $\ell_p$ regularization. Our approach works for data containing tuples of deformation, stress and material tangent, but also for data consisting only of tuples of deformation and stress, as is the case in real experiments. The developed approach is applied to several representative examples, where the data needed to train the PANN surrogate model are collected via computational homogenization. We show that the proposed model achieves excellent interpolation and extrapolation behavior. In addition, the approach is benchmarked against an NN model based on the components of the right Cauchy-Green deformation tensor.
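As a rough illustration of the invariant-based input described in the abstract, the following NumPy sketch forms the right Cauchy-Green tensor from a deformation gradient and evaluates a small invariant set with one symmetric 2nd order structure tensor. The function names, the particular five invariants, the choice $\boldsymbol G = \boldsymbol e_3 \otimes \boldsymbol e_3$ and the sample deformation gradient are assumptions made for illustration; they are not the invariant sets or parameterizations used in the paper. The sketch only shows why such inputs are unchanged by superposed rigid rotations and by rotations that leave the structure tensor unchanged.

```python
import numpy as np

def right_cauchy_green(F):
    """Right Cauchy-Green tensor C = F^T F for a 3x3 deformation gradient."""
    return F.T @ F

def invariants(F, G):
    """Illustrative invariants of C and a symmetric 2nd order structure tensor G.

    Assumption: this five-entry set is a common textbook choice, not
    necessarily the set used in the paper.
    """
    C = right_cauchy_green(F)
    C2 = C @ C
    I1 = np.trace(C)
    I2 = 0.5 * (np.trace(C) ** 2 - np.trace(C2))  # equals tr(cof C)
    I3 = np.linalg.det(C)
    I4 = np.trace(C @ G)   # mixed invariant, linear in C
    I5 = np.trace(C2 @ G)  # mixed invariant, quadratic in C
    return np.array([I1, I2, I3, I4, I5])

def rotation_z(angle):
    """Rotation about the e3 axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical example: preferred direction along e3, so G = e3 (x) e3.
G = np.diag([0.0, 0.0, 1.0])
F = np.array([[1.10, 0.05, 0.00],
              [0.00, 0.95, 0.10],
              [0.02, 0.00, 1.05]])
Q = rotation_z(0.7)

base = invariants(F, G)
objective = invariants(Q @ F, G)    # superposed rigid rotation: C is unchanged
symmetric = invariants(F @ Q.T, G)  # Q G Q^T = G, so Q is a symmetry rotation
```

In this sketch, `base`, `objective` and `symmetric` coincide up to round-off, so any energy that depends on the deformation only through these invariants is objective and respects the symmetry encoded by `G`, whatever form the energy function takes.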

Paper Structure (57 sections, 80 equations, 16 figures, 5 tables)

Introduction
Application of neural networks in constitutive modeling
Objectives and contributions of this work
Notation
Fundamentals
Kinematics and stress measures
Physical conditions for anisotropic finite strain hyperelasticity
Concept of structure tensors and isotropic tensor functions
Scale transition scheme
Macroscale modeling with generalized structure tensors
2nd order generalized structure tensor
4th order generalized structure tensor
6th order generalized structure tensor
Two 2nd order generalized structure tensors
Physics-augmented neural networks with anisotropy detection
...and 42 more sections

Figures (16)

Figure 1: Illustration of the neural network $\bar{\psi}^\text{NN}(\,\bar{\boldsymbol{\!\boldsymbol{\mathcal{I}}}}^\square)$ for the representation of the elastic potential described by an invariant set $\,\bar{\boldsymbol{\!\boldsymbol{\mathcal{I}}}}^\square$ built from $\bar{\boldsymbol C}:=\bar{\boldsymbol F}^T \cdot \bar{\boldsymbol F}$ and one or two structure tensor(s) $\square \in\{\bar{\boldsymbol G},\bar{\mathbb G}, \bar{\boldsymbol{\mathsf G}},(\bar{\boldsymbol G}_{1},\bar{\boldsymbol G}_{2})\}$, i.e., the mapping is $\bar{\psi}^\text{NN}: \mathbb R^n \to \mathbb R_{\ge 0}\,,\; \,\bar{\boldsymbol{\!\boldsymbol{\mathcal{I}}}}^\square \mapsto \bar{\psi}^\text{NN}(\,\bar{\boldsymbol{\!\boldsymbol{\mathcal{I}}}}^\square) := (n^\text{out} \circ g^\text{NN} \circ \boldsymbol{\mathscr l}^\text{gate} \circ \boldsymbol{\mathscr n}^\text{in})(\,\bar{\boldsymbol{\!\boldsymbol{\mathcal{I}}}}^\square)$. Therein, $\boldsymbol{\mathscr n}^\text{in}(\,\bar{\boldsymbol{\!\boldsymbol{\mathcal{I}}}}^\square)$ and $n^\text{out}(\mathfrak p^\text{NN})$ are non-trainable normalization layers that have to be fitted before training, $\boldsymbol{\mathscr l}^\text{gate}(\bar{\boldsymbol{\!\mathscr i}}^\square)$ is a trainable gate layer and $g^\text{NN}(\bar{\boldsymbol{\!\mathscr i}}^\square \odot \boldsymbol{g})$ is a standard PNN guaranteeing positive outputs. The vector $\boldsymbol{\mathscr m}_\square$ includes the parameters of the structure tensors and is also trainable.
Figure 2: Data generation and model identification procedure: (a) the deformation space is sampled in a prescribed range by Latin Hypercube Sampling (LHS), (b) by prescribing the sampled deformation gradients ${}^i\bar{\boldsymbol F}$ in RVE simulations, the corresponding energy ${}^i\bar{\psi}$, stress ${}^i\bar{\boldsymbol \sigma}$ and elasticity tensor ${}^i\bar{\mathbbm c}$ are calculated, (c) the dataset $\mathcal{D}$ is used to identify and calibrate the NN-based model $\bar{\psi}(\,\bar{\boldsymbol{\!\boldsymbol{\mathcal{I}}}}^\square,\bar{J})$ via a higher-order Sobolev training with prediction loss $\mathcal{L}^\text{pred}= w^\psi \mathcal{L}^{\psi} + w^{\boldsymbol {\sigma}}\mathcal{L}^{\boldsymbol {\sigma}} + w^{\mathbbm c}\mathcal{L}^{\mathbbm c}$ and gate loss $\mathcal{L}^\text{gate}$ for sparsity. Step (c) comprises the substeps (c.1)--(c.4): to identify which (set of) structure tensor(s) $\square \in \{\bar{\boldsymbol G}, \bar{\mathbb G}, \bar{\boldsymbol{\mathsf G}},(\bar{\boldsymbol G}_{1},\bar{\boldsymbol G}_{2})\}$, i.e., 2nd, 4th, 6th, or two 2nd order structure tensor(s), is needed, the training is performed sequentially from low to high structure tensor order, and an error check after each training decides whether the underlying anisotropy is described with sufficient accuracy.
Figure 3: Considered RVEs for data generation: (a) fiber reinforced material (stochastic fibers), (b) unit cell with hexagonal fiber arrangement (hexagonal fibers), (c) unit cell with one spherical inclusion (cubic sphere), (d) particle reinforced plane-like microstructure (plane spheres), and (e) particle reinforced chain-like microstructure (chain spheres). The volume fractions of the fiber/particle phase are given by $\phi \in \{ 30, 30, 20, 12, 15\}\,\%$ from left to right.
Figure 4: Sampled deformation space comprising 141 loading paths with 20 increments each. Shown are sectional planes of the Green-Lagrange strain tensor $\bar{\boldsymbol E}$.
Figure 5: Training process of the invariant-based NN model $\bar{\psi}^{\bar{\boldsymbol G}}(\,\bar{\boldsymbol{\!\boldsymbol{\mathcal{I}}}}^{\bar{\boldsymbol G}},\bar{J})$ for the RVE plane spheres with the loss $\mathcal{L} = \mathcal{L}^\text{pred} + 5.0 \cdot 10^{-5}\, \mathcal{L}^\text{gate}$, $\mathcal{L}^\text{pred} = 0.7\mathcal{L}^{\boldsymbol {\sigma}} + 0.3 \mathcal{L}^{\mathbbm c}$: (a) pre-training with the Adam optimizer and (b) post-training with the SLSQP optimizer. Shown is the prediction loss for five training runs.
...and 11 more figures
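
The loss structure quoted in the captions of Figures 2 and 5 can be sketched as follows. The weights 0.7, 0.3 and 5.0e-5 are taken from the Figure 5 caption. Everything else is an assumption for illustration: the plain mean-squared-error terms, the penalty form $\sum_i |g_i|^p$, the exponent `p = 0.25`, the function names and the toy array shapes are hypothetical, and the paper may normalize or define these terms differently.

```python
import numpy as np

def mse(pred, ref):
    """Plain mean squared error (assumed; the paper may normalize differently)."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.mean((pred - ref) ** 2))

def gate_penalty(gates, p=0.25):
    """Assumed l_p-type sparsity penalty sum_i |g_i|^p on the gate values."""
    return float(np.sum(np.abs(np.asarray(gates, dtype=float)) ** p))

def total_loss(sigma_pred, sigma_ref, c_pred, c_ref, gates,
               w_sigma=0.7, w_c=0.3, w_gate=5.0e-5, p=0.25):
    """Return (total loss, prediction loss, gate loss).

    The prediction loss combines the stress and tangent errors; the gate
    loss pushes unneeded invariants toward a closed gate.
    """
    loss_pred = w_sigma * mse(sigma_pred, sigma_ref) + w_c * mse(c_pred, c_ref)
    loss_gate = gate_penalty(gates, p)
    return loss_pred + w_gate * loss_gate, loss_pred, loss_gate

# Toy data with hypothetical shapes: 4 samples, 6 stress and 21 tangent entries.
sigma_ref = np.zeros((4, 6))
c_ref = np.zeros((4, 21))
gates = np.array([1.0, 0.0, 1.0])  # two open gates, one closed gate

loss, loss_pred, loss_gate = total_loss(sigma_ref, sigma_ref, c_ref, c_ref, gates)
```

With perfect predictions the prediction loss vanishes and only the two open gates contribute, so the total in this toy case is the gate weight times the number of open gates. For data sets without tangents, as mentioned in the abstract, the tangent term would simply be dropped by setting `w_c = 0`.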

Theorems & Definitions (5)

Remark 1
Remark 2
Remark 3
Remark 4
Remark 5
