Learning Visually Interpretable Oscillator Networks for Soft Continuum Robots from Video

Henrik Krauss; Johann Licher; Naoya Takeishi; Annika Raatz; Takehisa Yairi

TL;DR

This work tackles the interpretability gap in data-driven models of soft continuum robot (SCR) dynamics learned from video. It introduces the Attention Broadcast Decoder (ABCD), a plug-and-play autoencoder module that outputs a pixel-accurate attention map for each latent dimension and separates out the static background. Coupling these attention maps to 2D oscillator networks allows the learned dynamics (masses, stiffness, forces) to be visualized directly on the robot image, and the learned network discovers chain-structured oscillators for multi-segment SCRs. Empirically, ABCD improves multi-step prediction accuracy and enables smooth latent-space extrapolation, while keeping the model compact, physically interpretable, and suitable for control and for extension to 3D or multi-camera setups.

Abstract

Data-driven learning of soft continuum robot (SCR) dynamics from high-dimensional observations offers flexibility but often lacks physical interpretability, while model-based approaches require prior knowledge and can be computationally expensive. We bridge this gap with two contributions. (1) We introduce the Attention Broadcast Decoder (ABCD), a plug-and-play module for autoencoder-based latent dynamics learning that generates pixel-accurate attention maps localizing each latent dimension's contribution while filtering static backgrounds. (2) By coupling these attention maps to 2D oscillator networks, we enable direct on-image visualization of learned dynamics (masses, stiffness, and forces) without prior knowledge. We validate our approach on single- and double-segment SCRs, demonstrating that ABCD-based models significantly improve multi-step prediction accuracy: a 5.7x error reduction for Koopman operators and a 3.5x reduction for oscillator networks on the two-segment robot. The learned oscillator network autonomously discovers a chain structure of oscillators. Unlike standard methods, ABCD models enable smooth latent space extrapolation beyond training data. This fully data-driven approach yields compact, physically interpretable models suitable for control applications.
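
To make the notion of a chain-structured oscillator network concrete, the following is a minimal generic sketch and not the authors' implementation. It assumes scalar (1D) oscillators rather than the 2D oscillators used in the paper, a hand-built stiffness matrix rather than a learned one, and a linear model M x'' + D x' + K x = u. The names `chain_stiffness` and `rollout` are hypothetical.

```python
import numpy as np


def chain_stiffness(n, k=1.0):
    """Stiffness matrix for n oscillators in a chain (illustrative assumption).

    Each oscillator has a spring of stiffness k to ground, and neighbouring
    oscillators are connected by a spring of stiffness k.
    """
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] += k  # spring to ground
    for i in range(n - 1):
        # coupling spring between oscillator i and i + 1
        K[i, i] += k
        K[i + 1, i + 1] += k
        K[i, i + 1] -= k
        K[i + 1, i] -= k
    return K


def rollout(x0, v0, M, K, D, u, dt, steps):
    """Multi-step prediction of M x'' + D x' + K x = u with semi-implicit Euler."""
    x, v = x0.copy(), v0.copy()
    M_inv = np.linalg.inv(M)
    traj = [x.copy()]
    for _ in range(steps):
        a = M_inv @ (u - D @ v - K @ x)  # acceleration from force balance
        v = v + dt * a                   # update velocity first
        x = x + dt * v                   # then position with the new velocity
        traj.append(x.copy())
    return np.stack(traj)  # shape: (steps + 1, n)
```

In this sketch the chain structure shows up as a tridiagonal `K`: only neighbouring oscillators are coupled. In a learned model, the masses, stiffness, and damping would be fitted to data rather than specified by hand.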
