Zero-shot Sim-to-Real Transfer for Reinforcement Learning-based Visual Servoing of Soft Continuum Arms

Hsin-Jung Yang, Mahsa Khosravi, Benjamin Walt, Girish Krishnan, Soumik Sarkar

TL;DR

This work tackles zero-shot sim-to-real transfer for reinforcement learning-based visual servoing of soft continuum arms (SCAs) by decoupling kinematics from mechanical properties. A two-layer framework combines an RL kinematic controller operating in configuration space with a local controller that refines actuation, using minimal visual sensing and a lightweight perception pipeline. Trained entirely in simulation, the policy achieves a $99.8\%$ success rate in simulation and a $67\%$ success rate on real hardware without fine-tuning, demonstrating transfer on 3D visual servoing tasks with the BR2. The approach offers a scalable, generalizable path toward robust SCA control in unstructured 3D environments, with clear avenues for expanding the degrees of freedom (DOFs), improving centering accuracy, and handling diverse targets.

Abstract

The soft and deformable nature of soft continuum arms (SCAs) presents challenges in modeling and control due to their infinite degrees of freedom and non-linear behavior. This work introduces a reinforcement learning (RL)-based framework for visual servoing tasks on SCAs with zero-shot sim-to-real transfer capabilities, demonstrated on a single-section pneumatic manipulator capable of bending and twisting. The framework decouples kinematics from mechanical properties using an RL kinematic controller for motion planning and a local controller for actuation refinement, leveraging minimal sensing with visual feedback. Trained entirely in simulation, the RL controller achieved a 99.8% success rate. When deployed on hardware, it achieved a 67% success rate in zero-shot sim-to-real transfer, demonstrating robustness and adaptability. This approach offers a scalable solution for SCAs in 3D visual servoing, with potential for further refinement and expanded applications.

Paper Structure

This paper contains 28 sections, 6 figures, 1 table.

Figures (6)

  • Figure 1: Overview of the proposed RL-based visual servoing control framework for SCAs with zero-shot sim-to-real transfer capability. The framework servos the arm until the target is in view, shown here in simulation (top) and on real hardware (bottom). The image sequence shows the base camera view after each policy step, and the final distal camera view (right) shows the RL-based controller’s ability to locate and center the target. Demo video at https://tinyurl.com/53f5vdje.
  • Figure 2: a) Training and deployment framework of the RL kinematic controller. During training (black + blue paths), the RL agent learns a policy in simulation. In deployment (black + dashed paths), the trained policy outputs kinematic actions, which the local controller translates into actuation. b) Decoupling kinematics and mechanical properties: The RL kinematic controller handles the kinematics of the SCA, which are independent of specific hardware variations. The local controller handles the dynamics of the SCA during operation and tries to achieve the goal configuration determined by the RL kinematic controller. c) The local controller achieves the goal configuration without using a Configuration-to-Actuation map. The current $\kappa$ and $\tau$ are estimated, and the configuration error is passed to the heuristic, which generates a change in actuation. This process is iterated until the goal configuration is achieved.
  • Figure 3: a) Coordinate reference. b) and c) Scatter plots of the sampled target positions in the workspace. Each dot represents the outcome of an episode, where blue dots indicate successful goal-reaching and red dots represent failures. These plots illustrate the generalization capability of the trained RL kinematic controller across the workspace, with a high density of blue dots demonstrating robust performance. d) Histogram of the distance between the target bounding box centroid and the center of the distal camera frame.
  • Figure 4: Tip View Results. a) A representative view from the tip camera. The rings indicate pixel distance thresholds. b) A histogram of best centroid distances in pixels. c) A summary of the results of testing.
  • Figure 5: Regional Results. a) A top-down view of the test area showing the three distances tested. b) A view into the test area showing the four test heights. Locations of tip weight testing are marked with circles. c) A summary of the results by test point region.
  • ...and 1 more figure
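
The iterative local controller described in the Figure 2c caption can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation. The caption only says that the current $\kappa$ and $\tau$ are estimated, that the configuration error is passed to a heuristic, and that the resulting change in actuation is applied until the goal configuration is reached. The proportional heuristic, the gain, the tolerance, the two pressure channels, and the linear `toy_plant` standing in for the real arm and its estimator are all hypothetical. Treating $\kappa$ as the bending variable and $\tau$ as the twisting variable is also an interpretation.

```python
from dataclasses import dataclass


@dataclass
class Config:
    kappa: float  # assumed: bending configuration variable
    tau: float    # assumed: twisting configuration variable


def toy_plant(p_bend: float, p_twist: float) -> Config:
    """Hypothetical stand-in for estimating (kappa, tau) on the arm.

    The controller never reads these coefficients; it only sees the
    estimated configuration, mirroring the "no Configuration-to-Actuation
    map" idea in the caption.
    """
    return Config(kappa=0.8 * p_bend, tau=0.5 * p_twist)


def local_controller(goal, estimate=toy_plant, gain=0.5, tol=1e-3, max_iters=200):
    """Adjust actuation step by step until the goal configuration is reached.

    Returns the final pressures, the reached configuration, and the number
    of actuation updates that were applied.
    """
    p_bend, p_twist = 0.0, 0.0
    for step in range(max_iters):
        current = estimate(p_bend, p_twist)   # estimate current kappa, tau
        err_kappa = goal.kappa - current.kappa
        err_tau = goal.tau - current.tau
        if abs(err_kappa) < tol and abs(err_tau) < tol:
            return (p_bend, p_twist), current, step
        # Assumed heuristic: nudge each pressure in proportion to its error.
        p_bend += gain * err_kappa
        p_twist += gain * err_tau
    return (p_bend, p_twist), estimate(p_bend, p_twist), max_iters


goal = Config(kappa=1.0, tau=0.5)
pressures, reached, steps = local_controller(goal)
```

Because the loop reacts only to the measured configuration error, the same goal from the RL kinematic controller can in principle be reached on arms with different mechanical responses, which is the decoupling that Figure 2b describes.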