Input Specific Neural Networks

Asghar A. Jadoon, D. Thomas Seidl, Reese E. Jones, Jan N. Fuhg

TL;DR

The paper addresses the need to encode physical structure in neural models by introducing Input Specific Neural Networks (ISNNs), which enforce multiple input-specific constraints on a scalar output. It presents two architectures with analytically derived first- and second-order derivatives, enabling constrained learning and integration with finite-element (FE) solvers via manual differentiation. The approach is demonstrated in data-driven constitutive modeling for hyperelasticity, including forward/inverse problems and FE embedding, and is extended with a binary gating mechanism that learns from data whether the potential should be modeled as polyconvex or left unconstrained. The results show improved extrapolation, reduced input requirements compared to prior methods, and practical benefits for cross-platform deployment in commercial solvers, with potential for learning structural relationships in complex multiscale settings.
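To make the idea of manual differentiation concrete, here is a minimal sketch for a toy one-hidden-layer softplus network. It is not one of the paper's ISNN architectures, and the parameters `W`, `B`, `A` and all function names are hypothetical. It writes the gradient and Hessian in closed form and checks them against central finite differences.

```python
import math

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def sigmoid(z):
    # Derivative of softplus, written to avoid overflow for large |z|.
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical toy parameters: 3 hidden units, 2 inputs.
W = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.8]]
B = [0.1, -0.2, 0.0]
A = [1.0, 0.5, 2.0]

def preactivations(x):
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W, B)]

def f(x):
    # Scalar output: f(x) = sum_j A_j * softplus(W_j . x + B_j).
    return sum(a * softplus(z) for a, z in zip(A, preactivations(x)))

def grad(x):
    # df/dx_i = sum_j A_j * sigmoid(z_j) * W_ji
    s = [sigmoid(z) for z in preactivations(x)]
    return [sum(A[j] * s[j] * W[j][i] for j in range(len(A)))
            for i in range(len(x))]

def hess(x):
    # d2f/dx_i dx_k = sum_j A_j * s_j * (1 - s_j) * W_ji * W_jk
    s = [sigmoid(z) for z in preactivations(x)]
    n = len(x)
    return [[sum(A[j] * s[j] * (1.0 - s[j]) * W[j][i] * W[j][k]
                 for j in range(len(A)))
             for k in range(n)] for i in range(n)]

def fd_grad(x, h=1e-5):
    # Central finite differences of f, used only as a reference.
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        out.append((f(xp) - f(xm)) / (2.0 * h))
    return out

def fd_hess(x, h=1e-5):
    # Central finite differences of the analytic gradient.
    n = len(x)
    cols = []
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = grad(xp), grad(xm)
        cols.append([(p - m) / (2.0 * h) for p, m in zip(gp, gm)])
    return [[cols[k][i] for k in range(n)] for i in range(n)]

x0 = [0.3, -0.4]
grad_err = max(abs(g - d) for g, d in zip(grad(x0), fd_grad(x0)))
hess_err = max(abs(hess(x0)[i][k] - fd_hess(x0)[i][k])
               for i in range(2) for k in range(2))
print(grad_err, hess_err)
```

Closed-form expressions like these depend only on basic arithmetic, which is one plausible reason they port easily to solvers that have no automatic differentiation support.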

Abstract

The black-box nature of neural networks limits the ability to encode or impose specific structural relationships between inputs and outputs. While various studies have introduced architectures that ensure the network's output adheres to a particular form in relation to certain inputs, the majority of these approaches impose constraints on only a single set of inputs. This paper introduces a novel neural network architecture, termed the Input Specific Neural Network (ISNN), which extends this concept by allowing scalar-valued outputs to be subject to multiple constraints. Specifically, the ISNN can enforce convexity in some inputs, non-decreasing monotonicity combined with convexity with respect to others, and simple non-decreasing monotonicity or arbitrary relationships with additional inputs. The paper presents two distinct ISNN architectures, along with equations for the first and second derivatives of the output with respect to the inputs. Although these networks are broadly applicable, in this work we restrict their usage to problems in computational mechanics. In particular, we show how they can be effectively applied to fitting data-driven constitutive models. We then embed our trained data-driven constitutive laws into a finite element solver, where significant time savings can be achieved through explicit manual differentiation with the derived equations rather than automatic differentiation. We also show how ISNNs can be used to learn structural relationships between inputs and outputs via a binary gating mechanism. As a demonstration, ISNNs are employed to model an anisotropic free energy potential to obtain the homogenized macroscopic response in a decoupled multiscale setting, where the network learns whether or not the potential should be modeled as polyconvex, and retains only the relevant layers while using the minimum number of inputs.
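The four kinds of input-specific constraints can be illustrated with a single-hidden-layer sketch. This is a simplified construction under stated assumptions, not the ISNN-1 or ISNN-2 architecture: the function `isnn_sketch`, the parameter table `RAW`, and the choice of softplus to force weights non-negative are all hypothetical. The output is convex in `x`, convex and non-decreasing in `y`, non-decreasing in `t`, and unconstrained in `z`.

```python
import math

def softplus(v):
    # Convex, non-decreasing activation; also maps raw weights to positive values.
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))

# Hypothetical raw parameters for 3 hidden units; any sign is allowed here.
RAW = {
    "a":  [0.2, -0.5, 1.0],   # output weights (made non-negative below)
    "wx": [1.3, -0.8, 0.4],   # weights on x (sign unconstrained)
    "py": [-1.0, 0.3, 0.7],   # weights on y (made non-negative below)
    "ut": [0.5, -0.2, 0.1],   # inner weights on t (made non-negative below)
    "vt": [0.0, 0.9, -0.4],   # outer weights on t (made non-negative below)
    "qz": [1.1, -0.6, 0.2],   # inner weights on z (sign unconstrained)
    "rz": [-0.7, 0.5, 1.4],   # outer weights on z (sign unconstrained)
    "b":  [0.1, 0.0, -0.3],   # biases
}

def isnn_sketch(x, y, t, z):
    total = 0.0
    for j in range(len(RAW["a"])):
        a = softplus(RAW["a"][j])
        py = softplus(RAW["py"][j])
        ut = softplus(RAW["ut"][j])
        vt = softplus(RAW["vt"][j])
        # Offset: non-decreasing in t, arbitrary in z.
        c = (vt * math.tanh(ut * t)
             + RAW["rz"][j] * math.tanh(RAW["qz"][j] * z)
             + RAW["b"][j])
        # The pre-activation is affine in x and y, so softplus keeps convexity;
        # the non-negative weight on y adds monotonicity in y.
        total += a * softplus(RAW["wx"][j] * x + py * y + c)
    return total

def first_diff(g, v, h=0.1):
    return g(v + h) - g(v)

def second_diff(g, v, h=0.1):
    return g(v + h) - 2.0 * g(v) + g(v - h)

PTS = [-1.5, 0.0, 1.5]

def check_constraints(tol=1e-9):
    # Numerical spot check of the constraints on a coarse grid.
    for x in PTS:
        for y in PTS:
            for t in PTS:
                for z in PTS:
                    fx = lambda v: isnn_sketch(v, y, t, z)
                    fy = lambda v: isnn_sketch(x, v, t, z)
                    ft = lambda v: isnn_sketch(x, y, v, z)
                    if second_diff(fx, x) < -tol:
                        return False  # not convex in x
                    if second_diff(fy, y) < -tol or first_diff(fy, y) < -tol:
                        return False  # not convex / non-decreasing in y
                    if first_diff(ft, t) < -tol:
                        return False  # not non-decreasing in t
    return True

print(check_constraints())
```

Because the constraints follow from the sign restrictions and the activation choice, they hold for any values in `RAW`, including values reached during training; the grid check is only a sanity test.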

Paper Structure

This paper contains 13 sections, 49 equations, 23 figures.

Figures (23)

  • Figure 1: ISNN-1 schematic; the terms inside the activation functions correspond to Eq. \ref{yHtype1_1} through Eq. \ref{xHtype1_1}.
  • Figure 2: ISNN-2 schematic; the terms inside the activation functions correspond to Eq. \ref{y_eq_2} through Eq. \ref{x_eq_2}.
  • Figure 3: (a) Training loss and (b) test loss for both ISNN architectures and the feed-forward neural network (FFNN) on the dataset generated using Eq. \ref{eq_f}. For each architecture, the solid line shows the mean loss over 10 different initializations and the shaded region shows the standard deviation.
  • Figure 4: Each model's predictive behavior on data seen during training (interpolated response) and on unseen data (extrapolated response) for the dataset generated using Eq. \ref{eq_f}. For each architecture, the solid line shows the mean loss over 10 different initializations and the shaded region shows the standard deviation.
  • Figure 5: (a) Training loss and (b) test loss for both ISNN architectures and the FFNN on the dataset generated using Eq. \ref{eq_g}. For each architecture, the solid line shows the mean loss over 10 different initializations and the shaded region shows the standard deviation.
  • ...and 18 more figures

Theorems & Definitions (3)

  • Remark 1
  • Remark 2
  • Remark 3