Bio-Inspired Plastic Neural Networks for Zero-Shot Out-of-Distribution Generalization in Complex Animal-Inspired Robots

Binggwong Leung, Worasuchad Haomachai, Joachim Winther Pedersen, Sebastian Risi, Poramate Manoonpong

TL;DR

The paper tackles the brittleness of deep policy controllers in robotics under out-of-distribution (OOD) conditions and the sim-to-real gap by introducing a bio-inspired neural network with Hebbian plasticity and a weight-normalization mechanism. Trained with evolution strategies, the model is evaluated on complex 18-DOF dung beetle-like and 16-DOF gecko-like robots, demonstrating zero-shot sim-to-real locomotion and robust generalization to uneven terrain and morphological damage. A key contribution is the incorporation of two normalization schemes for plastic weight updates and the use of PCA to reveal the dynamic weight attractors that underlie adaptive behavior. The results show that Hebbian plasticity can yield robust, adaptable locomotion in real-world, high-DOF legged robots without terrain randomization, offering a promising alternative to more data-intensive domain randomization approaches.

Abstract

Artificial neural networks can be used to solve a variety of robotic tasks. However, they risk failing catastrophically when faced with out-of-distribution (OOD) situations. Several approaches have employed a type of synaptic plasticity known as Hebbian learning, which dynamically adjusts weights based on local neural activities. Research has shown that synaptic plasticity can make policies more robust and help them adapt to unforeseen changes in the environment. However, networks augmented with Hebbian learning can suffer from weight divergence, resulting in network instability. Furthermore, such Hebbian networks have not yet been applied to legged locomotion in complex real robots with many degrees of freedom. In this work, we improve the Hebbian network with a weight normalization mechanism that prevents weight divergence, analyze the principal components of the Hebbian network's weights, and perform a thorough evaluation of network performance in locomotion control for real 18-DOF dung beetle-like and 16-DOF gecko-like robots. We find that the Hebbian-based plastic network can perform zero-shot sim-to-real locomotion adaptation and generalize to unseen conditions, such as uneven terrain and morphological damage.
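
The weight-divergence problem and the effect of normalization can be illustrated with a minimal sketch. It assumes a plain Hebbian rule (Δw = η · pre · post) on a tiny linear layer, and the helpers `hebbian_step`, `normalize_max`, `normalize_std`, and `simulate` are hypothetical names chosen to mirror the Hebbian-max and Hebbian-stdn variants. The paper's actual update rule and normalization equations are not reproduced on this page, so every formula below is an illustrative assumption rather than the authors' implementation.

```python
import math

def hebbian_step(w, pre, post, lr=0.1):
    """Plain Hebbian update (assumed rule): w[i][j] += lr * pre[i] * post[j]."""
    return [[w[i][j] + lr * pre[i] * post[j] for j in range(len(post))]
            for i in range(len(pre))]

def normalize_max(w, eps=1e-8):
    """Assumed 'max' scheme: divide by the largest absolute weight."""
    m = max(abs(x) for row in w for x in row)
    return [[x / (m + eps) for x in row] for row in w]

def normalize_std(w, eps=1e-8):
    """Assumed 'std' scheme: divide by the standard deviation of the layer."""
    flat = [x for row in w for x in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((x - mean) ** 2 for x in flat) / len(flat))
    return [[x / (std + eps) for x in row] for row in w]

def max_abs(w):
    """Largest absolute entry of a weight matrix."""
    return max(abs(x) for row in w for x in row)

def simulate(steps, normalizer=None):
    """Repeatedly apply the Hebbian update to a 2x2 linear layer."""
    w = [[0.5, -0.2], [0.1, 0.3]]   # arbitrary illustrative initial weights
    pre = [1.0, 0.5]                # constant pre-synaptic activity
    for _ in range(steps):
        # Linear post-synaptic activity: post[j] = sum_i pre[i] * w[i][j]
        post = [sum(pre[i] * w[i][j] for i in range(len(pre)))
                for j in range(len(w[0]))]
        w = hebbian_step(w, pre, post)
        if normalizer is not None:
            w = normalizer(w)
    return w

unbounded = simulate(200)
bounded = simulate(200, normalize_max)
print("no normalization :", max_abs(unbounded))
print("max normalization:", max_abs(bounded))
```

In this toy setting the unnormalized weights grow geometrically because each update feeds the layer's own output back into its weights, while either normalizer keeps the weight scale fixed after every step.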

Paper Structure

This paper contains 17 sections, 7 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: A neural network with Hebbian plasticity (a) is trained to control a robot in simulation and then transferred to a physical robot. A dung beetle-like robot (b) and a gecko-like robot (c) were used as complex experimental platforms in this study.
  • Figure 2: (left) Training curves for dung beetle-like robot locomotion. The graph shows the mean and standard deviation of the best individual's performance across five trials for each model. (right) Comparison of standard deviation normalization (Hebbian-stdn, see Eq. \ref{eq:varnorm}) and maximum normalization (Hebbian-max, see Eq. \ref{eq:maxnorm}) for normalizing the dynamic weights of the Hebbian network. The Hebbian-max method yields a better-performing solution.
  • Figure 3: Locomotion of the real-world dung beetle-like robot. Snapshots, walking trajectories, and walking speed (m/s) are shown. The labeled numbers (_1, _2, _3) represent the ranking of the model arranged according to the training reward. Real robot locomotion can be seen at https://bit.ly/3D2ZHBf.
  • Figure 4: The robot with the Hebbian network can automatically initiate walking behavior based on foot contact inputs. When the robot is placed on the ground, it begins to walk. When the robot is lifted, the legs stop oscillating (see also https://bit.ly/3D2ZHBf).
  • Figure 5: Trajectory of the plastic weights. (a) Principal component analysis (PCA) on the weight space of the Hebbian network optimized on the locomotion task in the robot simulation. (b) PCA on the hidden states of the LSTM network on the locomotion task in robot simulation. Only the LSTM network with trained parameters exhibits a limit cycle in the PCs of the hidden states. The 'x' and star symbols indicate the value at the beginning and end of the episode, respectively.
  • ...and 3 more figures
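
The weight-space analysis described for Figure 5 can be sketched as follows: record the flattened plastic weights at every time step, centre them, and project them onto the leading principal component. This is a minimal sketch under stated assumptions. The trajectory is synthetic (an oscillation invented for illustration, not data from the paper), and `top_principal_component` is a hypothetical helper that uses power iteration instead of a full eigendecomposition.

```python
import math

def top_principal_component(samples, iters=200):
    """Return (unit direction, projections) of the first PC via power iteration."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    centered = [[s[k] - mean[k] for k in range(d)] for s in samples]
    v = [1.0 / math.sqrt(d)] * d  # initial guess for the direction
    for _ in range(iters):
        # Apply the covariance matrix implicitly: C v = X^T (X v) / n
        proj = [sum(c[k] * v[k] for k in range(d)) for c in centered]
        u = [sum(proj[t] * centered[t][k] for t in range(n)) / n
             for k in range(d)]
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    proj = [sum(c[k] * v[k] for k in range(d)) for c in centered]
    return v, proj

# Synthetic "weight trajectory": three flattened weights over 100 time steps.
# The first weight oscillates strongly, the second weakly, the third is static.
trajectory = [[2.0 * math.cos(0.3 * t), 0.1 * math.sin(0.3 * t), 0.0]
              for t in range(100)]

direction, projection = top_principal_component(trajectory)
print("first PC direction:", [round(x, 3) for x in direction])
```

Plotting `projection` against time (or against the projection onto a second component) is what would expose a periodic structure such as a limit cycle; here the first component simply recovers the dominant oscillating weight.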