Octopus-inspired Distributed Control for Soft Robotic Arms: A Graph Neural Network-Based Attention Policy with Environmental Interaction

Linxin Hou, Qirui Wu, Zhihang Qin, Yongxin Guo, Cecilia Laschi

TL;DR

Robustness tests with observation noise, single-section actuation failure, and transient disturbances show that SoftGM preserves success while keeping control effort bounded, indicating resilient coordination driven by selective contact-relevant information routing.

Abstract

This paper proposes SoftGM, an octopus-inspired distributed control architecture for segmented soft robotic arms that learn to reach targets in contact-rich environments, discovering obstacles online rather than relying on global obstacle geometry. SoftGM treats each arm section as a cooperative agent and represents the arm-environment interaction as a graph. It uses a two-stage graph attention message passing scheme and follows the Centralised Training Decentralised Execution (CTDE) paradigm, with a centralised critic and decentralised actors. We evaluate SoftGM in a Cosserat-rod simulator (PyElastica) across three tasks of increasing environmental complexity: obstacle-free, structured obstacles, and a wall-with-hole scenario. Compared with six widely used MARL baselines (IDDPG, IPPO, ISAC, MADDPG, MAPPO, MASAC) under identical information content and training conditions, SoftGM matches strong CTDE methods in the simpler settings and achieves the best performance in the wall-with-hole task. Robustness tests with observation noise, single-section actuation failure, and transient disturbances show that SoftGM preserves success while keeping control effort bounded, indicating resilient coordination driven by selective contact-relevant information routing.
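
To make the idea of two-stage graph attention over an arm-environment graph more concrete, the following is a minimal sketch in plain Python. It is not the SoftGM implementation: the abstract does not specify what the two stages are, so the split used here (first obstacle-to-section, then section-to-section along the arm) is an assumption, as are the function names `masked_attention`, `chain_mask`, and `two_stage_update`, the absence of learned projections, and the toy feature vectors.

```python
import math


def masked_attention(queries, keys, values, mask):
    """Scaled dot-product attention; mask[i][j] is True if node i may attend to node j.

    Returns (outputs, weights). A query with no permitted key gets a zero output.
    """
    dim = len(queries[0])
    outputs, weights = [], []
    for q, allowed in zip(queries, mask):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) if ok else None
            for k, ok in zip(keys, allowed)
        ]
        valid = [s for s in scores if s is not None]
        if not valid:
            weights.append([0.0] * len(keys))
            outputs.append([0.0] * len(values[0]))
            continue
        top = max(valid)  # subtract the max for numerical stability
        exps = [math.exp(s - top) if s is not None else 0.0 for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        outputs.append(
            [sum(w * v[c] for w, v in zip(row, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


def chain_mask(n):
    """Each section may attend to itself and its immediate neighbours along the arm."""
    return [[abs(i - j) <= 1 for j in range(n)] for i in range(n)]


def two_stage_update(section_feats, obstacle_feats, contact_mask):
    # Stage 1 (assumed): sections gather information from obstacles they have discovered.
    env_msg, env_w = masked_attention(
        section_feats, obstacle_feats, obstacle_feats, contact_mask
    )
    fused = [[s + m for s, m in zip(sec, msg)] for sec, msg in zip(section_feats, env_msg)]
    # Stage 2 (assumed): sections exchange the fused features with their neighbours.
    arm_msg, arm_w = masked_attention(fused, fused, fused, chain_mask(len(fused)))
    return arm_msg, env_w, arm_w


# Toy example: three arm sections, one obstacle touched only by the middle section.
sections = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
obstacles = [[0.5, 0.5]]
contacts = [[False], [True], [False]]
out, env_w, arm_w = two_stage_update(sections, obstacles, contacts)
```

In this toy setup only the middle section receives an obstacle message in the first stage, and the second stage lets that information reach its neighbours through the chain mask. This illustrates what contact-relevant routing could look like under the stated assumptions.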

Paper Structure (20 sections, 6 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Overview of the SoftGM framework with graph construction and attention-based message passing.
  • Figure 2: Snapshots of the simulated soft robotic arm (blue) in the three task scenarios. (a) Basic (no obstacle). (b) Structured obstacles. (c) Wall-with-hole.
  • Figure 3: Learning curves of episodic rewards, evaluation success rates and mean episode lengths for SoftGM and the six MARL benchmarks in the three task scenarios over 3 random seeds.
  • Figure 4: Visualisation of the attention matrices of the soft rod in the three task scenarios. Each row shows how the attention matrix evolves over one successful episode in one scenario. Darker colours indicate higher attention scores.
  • Figure 5: Illustration of the ideal and three non-ideal situations for robustness evaluation of SoftGM. (a) Ideal situation. (b) Noisy environment. (c) One section of the soft arm fails. (d) Disturbance.
  • ...and 1 more figure