Training all-mechanical neural networks for task learning through in situ backpropagation

Shuaifeng Li, Xiaoming Mao

TL;DR

The paper addresses how to train mechanical neural networks (MNNs) efficiently by introducing an in situ backpropagation scheme derived from the adjoint variable method. The exact gradient $\nabla\mathcal{L} = e_{\mathrm{adj}} \circ e$, the element-wise product of the bond elongations in the adjoint field and the forward field, is obtained locally in two steps, enabling gradient-based learning without centralized computation. The authors validate the approach experimentally on 3D-printed MNNs, demonstrating behavior learning, linear regression, and Iris classification, and show that the networks can be retrained after task-switching and damage. This work paves the way for mechanical machine learning hardware and autonomous self-learning material systems, with potential extensions to nonlinear regimes and tunable materials.
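
As a rough illustration of the two-step gradient, the following NumPy sketch applies the formula to a hypothetical 1D chain of linear springs clamped between two walls. The chain geometry, the squared-error loss on a single output node, the sign convention for the adjoint force (taken here as $-\partial\mathcal{L}/\partial u$), and all function names are assumptions made for this example; none of it is the authors' setup or code. The sketch compares the adjoint result with a central finite difference.

```python
import numpy as np


def chain_compatibility(n_nodes):
    """Compatibility matrix C (bonds x nodes) for a 1D chain of free nodes
    between two fixed walls, so that elongations are e = C @ u."""
    C = np.zeros((n_nodes + 1, n_nodes))
    for b in range(n_nodes + 1):
        if b < n_nodes:
            C[b, b] = 1.0       # right end of bond b is free node b
        if b > 0:
            C[b, b - 1] = -1.0  # left end of bond b is free node b - 1
    return C


def solve_displacement(C, k, force):
    """Solve the linear equilibrium D u = F with D = C^T diag(k) C."""
    D = C.T @ (k[:, None] * C)
    return np.linalg.solve(D, force)


def loss(u, out_idx, target):
    """Assumed loss: squared error of one output displacement."""
    return 0.5 * (u[out_idx] - target) ** 2


def adjoint_gradient(C, k, force, out_idx, target):
    # Step 1: forward field and its bond elongations.
    u = solve_displacement(C, k, force)
    e = C @ u
    # Step 2: adjoint field, driven by the assumed adjoint force -dL/du.
    f_adj = np.zeros_like(force)
    f_adj[out_idx] = -(u[out_idx] - target)
    e_adj = C @ solve_displacement(C, k, f_adj)
    # Gradient w.r.t. each spring constant: element-wise product.
    return e_adj * e


def finite_difference_gradient(C, k, force, out_idx, target, dk=1e-6):
    """Central finite difference, two extra solves per bond."""
    grad = np.zeros_like(k)
    for i in range(k.size):
        k_plus, k_minus = k.copy(), k.copy()
        k_plus[i] += dk
        k_minus[i] -= dk
        l_plus = loss(solve_displacement(C, k_plus, force), out_idx, target)
        l_minus = loss(solve_displacement(C, k_minus, force), out_idx, target)
        grad[i] = (l_plus - l_minus) / (2.0 * dk)
    return grad


rng = np.random.default_rng(0)
n = 5
C = chain_compatibility(n)
k = rng.uniform(1.0, 2.0, size=n + 1)  # hypothetical spring constants
force = np.zeros(n)
force[0] = 1.0                         # unit force on the first node

g_adj = adjoint_gradient(C, k, force, out_idx=3, target=0.1)
g_fd = finite_difference_gradient(C, k, force, out_idx=3, target=0.1)
print("max |adjoint - finite difference|:", np.max(np.abs(g_adj - g_fd)))
```

In this sketch the adjoint route needs two linear solves regardless of the number of bonds, while the finite-difference loop needs two solves per bond. Each gradient entry depends only on the elongation of its own bond in the two fields, which is the sense in which the gradient is local.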

Abstract

Recent advances have unveiled physical neural networks as promising machine learning platforms, offering faster and more energy-efficient information processing. Compared with extensively studied optical neural networks, the development of mechanical neural networks (MNNs) remains nascent and faces significant challenges, including heavy computational demands and learning with approximate gradients. Here, we introduce the mechanical analogue of in situ backpropagation to enable highly efficient training of MNNs. We demonstrate that the exact gradient can be obtained locally in MNNs, enabling learning through their immediate vicinity. With the gradient information, we showcase the successful training of MNNs for behavior learning and machine learning tasks, achieving high accuracy in regression and classification. Furthermore, we present the retrainability of MNNs involving task-switching and damage, demonstrating their resilience. Our findings, which integrate the theory for training MNNs with experimental and numerical validations, pave the way for mechanical machine learning hardware and autonomous self-learning material systems.

Paper Structure (9 sections, 10 equations, 5 figures)

Figures (5)

  • Figure 1: Experimental demonstration of in situ backpropagation. a 3D-printed mechanical networks using the Polyjet rubber-like material Agilus30. b The experimental setups for the forward field and the adjoint field, and the resulting gradient of the loss function, are shown from left to right. c The experimental elongations of the forward field and the adjoint field are shown in the first and second panels, respectively. The experimental gradient is shown in the third panel. d The corresponding simulated elongations and gradient are shown from left to right. e The comparison between the finite difference method and our adjoint method. The left panel shows the numerical error as a function of step size in the finite difference method. The shaded area represents the experimentally feasible region with large step sizes, below which the step size $\Delta k$ is too small for manufacturing accuracy. The inset shows the experimental error of the adjoint method. The numerical error of the adjoint method is zero. The right panel shows the number of simulations required by the finite difference method and the adjoint method to obtain the gradient, as a function of the number of bonds in the MNNs.
  • Figure 2: Behavior learning using MNNs. a The symmetric output of the MNN under the applied force before training. The top panel shows the configuration of the mechanical network. The bottom panel shows the simulated and experimental vertical displacements $u_{y}$ of two nodes. b The loss and the absolute difference of the vertical displacements of two nodes, $|\Delta u_{y}|$, as a function of iteration in the training process. c, d Two asymmetric outputs under the applied force as a result of the training. The top panels in c and d show the configuration of the MNNs. The bottom panels in c and d show the simulated and experimental vertical displacements $u_{y}$ of two nodes. The blue triangles, red dots and cyan stars in the top panels of a, c and d represent the fixed nodes, the input node and the output nodes, respectively. The error bars in the bottom panels of a, c and d are calculated from the standard deviation of three independent experiments.
  • Figure 3: Linear regression using MNNs. a The synthetic noise-free and noisy datasets (circles) are shown from left to right. The regression results are shown as solid lines. b The loss for the noise-free dataset (purple) and the noisy dataset (green) as a function of epoch in the training process. c The simulated regression results (solid lines) at epochs $1$, $10$ and $5000$ are shown from left to right, respectively. The experimental regression results obtained with the MNN at epoch $5000$ are shown as circles. Note that the results of three independent experiments almost overlap. d The trained configuration of the MNNs for the regression tasks. The blue triangles, red dot and stars represent the fixed boundary, the input node and the output nodes, respectively. e The experimental setup for the regression task when the input force is equivalent to $6~\mathrm{g}$.
  • Figure 4: Classification using MNNs. a The Iris flower classification dataset. The relation between sepal length and petal length is visualized. b The loss (purple) and classification accuracy (orange for the training set and blue for the testing set) as a function of epoch in the training process. The inset shows the trained configuration of the MNN. The blue triangles and red dots represent the fixed boundary and the input nodes, respectively. The symbols used in a are shown in the inset of b to represent the output nodes for the corresponding type of Iris flower. c The classification results at epochs $10$, $20$ and $100$ are shown from left to right, respectively. d The comparison of classification results between simulation and experiment for the MNN at epoch $100$. The insets display the experimental setups. The error bars are calculated from the standard deviation of three independent experiments.
  • Figure 5: Retrainable mechanical networks. a The loss (purple) and classification accuracy (orange for the training set and blue for the testing set) as a function of epoch in the training process of the Iris flower classification task. The inset shows the trained MNN. This MNN is subsequently taken as the initial system for new task training (top) and retraining after damage (bottom), respectively. b The loss (purple) and regression accuracy (orange) as a function of epoch in the training process when using the noise-free dataset and the trained MNN of the classification task as the initial MNN. The inset shows the trained MNN. c The loss (purple) and classification accuracy (orange and blue) as a function of epoch in the training process when using the trained MNN of the regression task as the initial MNN. The inset shows the trained MNN. d The schematic shows that a bond of the MNN for classification tasks is pruned. e The loss (purple) and classification accuracy (orange for the training set and blue for the testing set) as a function of epoch in the training process when using the pruned MNN as the initial MNN. The inset shows the trained MNN. The blue triangles, red dots and cyan stars in the MNNs represent the fixed nodes, the input nodes and the output nodes, respectively.