Learning to Crawl: Latent Model-Based Reinforcement Learning for Soft Robotic Adaptive Locomotion
Vaughn Gzenda, Robin Chhabra
TL;DR
The paper addresses how soft robotic crawlers can learn locomotion policies from noisy sensor data without an explicit continuum-body model. It learns a latent dynamics model from inertial measurement unit (IMU) and time-of-flight (TOF) measurements and embeds that model in a Dreamer-style actor-critic framework that optimizes periodic gait parameters. Perception is trained with a variational free energy objective, and the latent predictions drive short-horizon planning for policy optimization. In simulation, the learned gaits move the crawler forward toward a target within roughly 14 seconds, indicating robustness to sensor noise and potential for autonomous soft-robot locomotion.
Abstract
Soft robotic crawlers are mobile robots that utilize soft body deformability and compliance to achieve locomotion through surface contact. Designing control strategies for such systems is challenging due to model inaccuracies, sensor noise, and the need to discover locomotor gaits. In this work, we present a model-based reinforcement learning (MB-RL) framework in which latent dynamics inferred from onboard sensors serve as a predictive model that guides an actor-critic algorithm to optimize locomotor policies. We evaluate the framework on a minimal crawler model in simulation using inertial measurement units and time-of-flight sensors as observations. The learned latent dynamics enable short-horizon motion prediction while the actor-critic discovers effective locomotor policies. This approach highlights the potential of latent-dynamics MB-RL for enabling embodied soft robotic adaptive locomotion based solely on noisy sensor feedback.
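To make the variational free energy objective mentioned in the TL;DR concrete, the following is a minimal sketch of its generic form for diagonal Gaussian distributions: a reconstruction term (negative log-likelihood of the observation) plus a KL divergence between the posterior and prior over the latent state. This is not the authors' implementation; the function names, the Gaussian assumption, and the single-sample estimate of the reconstruction term are all illustrative assumptions.

```python
import math

def gaussian_kl(mu_q, var_q, mu_p, var_p):
    """KL(q || p) between diagonal Gaussians, summed over latent dimensions."""
    return sum(
        0.5 * (math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0)
        for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p)
    )

def gaussian_nll(obs, mean, var):
    """Negative log-likelihood of obs under a diagonal Gaussian decoder."""
    return sum(
        0.5 * (math.log(2.0 * math.pi * v) + (o - m) ** 2 / v)
        for o, m, v in zip(obs, mean, var)
    )

def free_energy(obs, recon_mean, recon_var, mu_q, var_q, mu_p, var_p):
    """Single-sample free energy estimate: reconstruction NLL + KL(q || p).

    recon_mean / recon_var are assumed to come from a (hypothetical) decoder
    evaluated at one latent sample drawn from the posterior q.
    """
    return gaussian_nll(obs, recon_mean, recon_var) + gaussian_kl(
        mu_q, var_q, mu_p, var_p
    )
```

Minimizing this quantity over the encoder, decoder, and latent transition parameters trades off explaining the noisy sensor readings against keeping the inferred latent state close to what the dynamics model predicted.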
