Task-Based Design and Policy Co-Optimization for Tendon-driven Underactuated Kinematic Chains

Sharfin Islam; Zhanpeng He; Matei Ciocarlie

TL;DR

The paper tackles the challenge of designing and controlling underactuated tendon-driven manipulators by formulating a general $N$-link, $M$-actuator forward model and applying MORPH-based end-to-end co-optimization to learn both hardware parameters $\phi$ (e.g., radii, pretensions) and a policy $\pi_\theta$ for reaching tasks. The approach enables end-to-end optimization despite non-differentiable physics by using a neural proxy and CMA-ES to adjust hardware design, validated on a 3-link, 2-actuator tentacle with real-hardware transfer. Experimental results show improved task performance, sub-millimeter real-world accuracy in some setups, and substantial sim-to-real transfer gains when using closed-loop control. Overall, the work demonstrates that task-based design and policy co-optimization can yield flexible, compact tendon-driven robots that transfer effectively to real hardware.

Abstract

Underactuated manipulators reduce the number of bulky motors, thereby enabling compact and mechanically robust designs. However, having fewer actuators than joints means that the manipulator can only access a specific manifold within the joint space, which is particular to a given hardware configuration and can be low-dimensional and/or discontinuous. Determining an appropriate set of hardware parameters for this class of mechanisms is therefore difficult, even for traditional task-based co-optimization methods. In this paper, our goal is to implement a task-based design and policy co-optimization method for underactuated, tendon-driven manipulators. We first formulate a general model for an underactuated, tendon-driven transmission. We then use this model to co-optimize a three-link, two-actuator kinematic chain using reinforcement learning. We demonstrate with real-world reaching experiments that our optimized tendon transmission and control policy can be transferred reliably to physical hardware.
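To make the manifold argument above concrete, the following Python sketch computes the rest pose of a three-joint chain pulled by two tendons. It is an illustration under stated assumptions, not the paper's forward model: joints are modelled as linear torsional springs, both tendons are assumed to stay taut (so the tendon constraints are equalities), and moment-arm radii are constant. The function names and all numeric values are hypothetical.

```python
def spring_energy(q, stiffness):
    """Elastic energy of linear joint springs: 0.5 * sum(k_j * q_j^2)."""
    return 0.5 * sum(k * a * a for k, a in zip(stiffness, q))


def forward_pose(radii, stiffness, excursion):
    """Joint angles that minimize spring energy subject to two tendon constraints.

    Assumed (hypothetical) model: tendon i shortens by
    sum_j radii[i][j] * q[j] = excursion[i].
    Stationarity of the Lagrangian gives k_j * q_j = sum_i radii[i][j] * t_i,
    where t_i are the multipliers (tendon tensions in this toy model).
    """
    n = len(stiffness)

    def coupling(i, k):
        # Entry (i, k) of the 2x2 matrix R K^-1 R^T.
        return sum(radii[i][j] * radii[k][j] / stiffness[j] for j in range(n))

    a00, a01, a11 = coupling(0, 0), coupling(0, 1), coupling(1, 1)
    det = a00 * a11 - a01 * a01
    if det <= 1e-12 * a00 * a11:
        raise ValueError("tendon routings are (nearly) linearly dependent")

    # Solve the 2x2 system for the multipliers with Cramer's rule.
    t0 = (excursion[0] * a11 - a01 * excursion[1]) / det
    t1 = (a00 * excursion[1] - a01 * excursion[0]) / det

    # Recover the joint angles from the stationarity condition.
    return [(radii[0][j] * t0 + radii[1][j] * t1) / stiffness[j] for j in range(n)]


# Hypothetical design: radii in metres, stiffness in N*m/rad.
RADII = [[0.010, 0.008, 0.006],   # tendon 1 wraps all three joints
         [0.000, 0.008, 0.006]]   # tendon 2 skips the first joint
STIFFNESS = [0.5, 0.5, 0.5]

pose = forward_pose(RADII, STIFFNESS, excursion=[0.012, 0.008])
print(pose)
```

In this toy model, sweeping the two excursions traces a two-dimensional surface in the three-dimensional joint space, and changing any radius changes that surface. The sketch does not check that the multipliers stay non-negative, so it cannot represent a tendon going slack.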

Paper Structure (14 sections, 7 equations, 7 figures, 2 tables)

Introduction
Related Work
Method
  Transmission design
  Forward actuation model for our transmission design
  Task-aware co-optimization of design and control
Experimental Set-up
  Design and control co-optimization
  Hardware implementation and sim-to-real transfer
Results and Analysis
  Co-optimization results
  Sim-to-real accuracy
  Energy manifold optimization
Conclusion

Figures (7)

Figure 1: We optimize an underactuated, tendon-driven transmission for kinematic chains. We formulate and parameterize a general model for N links and M actuators. We apply our model to a three-link, two-actuator tentacle that we co-optimize using reinforcement learning (top row, left). We then validate our results on physical hardware (right).
Figure 2: Our flexible tendon-transmission design for compliant, underactuated kinematic chains. In this design, $N$ links are driven by $M$ actuated flexion tendons. We also implement a passive extension mechanism, which we can precisely pre-tension. The parameters of our transmission are all flexion tendon radii, extension tendon radii, and elastic tendon pre-elongations.
Figure 3: Illustrations of our optimization-based forward actuation model. (a) and (b) show the global energy landscape (z-axis). The yellow regions are manifolds that satisfy our constraints (Eq. 6 and Eq. 7). The red arrows in (a) represent optimization steps (Eq. 5) that find the energy minimum inside the manifold. Changing the design parameters or the control actions changes the manifold.
Figure 4: Average episode returns for goal reaching via re-fabrication.
Figure 5: Qualitative results for optimizing underactuated robots to reach different goals in simulation (goal reaching via re-fabrication). (a) shows the original design and the goal locations. (b) and (c) show two optimized underactuated tentacles reaching different goals.
...and 2 more figures
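
The goal-reaching design search summarized in the TL;DR and pictured in Figure 5 can be illustrated with a toy example. The Python sketch below jointly searches over three pulley radii and one tendon excursion so that a planar three-link chain reaches a goal point. It is a deliberately simplified stand-in and makes several assumptions that are not from the paper: it uses a plain elitist (1+λ) evolution strategy rather than CMA-ES, a single tendon with equal joint stiffness, one fixed open-loop action instead of a learned policy, and no neural proxy. All names, bounds, and numeric values are hypothetical.

```python
import math
import random

LINK_LENGTH = 0.05  # metres per link (hypothetical)


def joint_angles(radii, excursion):
    """Minimum-energy pose for one taut tendon and equal joint stiffness.

    Minimizing sum(q_j^2) subject to sum(r_j * q_j) = excursion gives
    q_j = r_j * excursion / sum(r_i^2).
    """
    denom = sum(r * r for r in radii)
    return [r * excursion / denom for r in radii]


def fingertip(q):
    """Planar forward kinematics: tip position of a serial chain."""
    x = y = angle = 0.0
    for qj in q:
        angle += qj
        x += LINK_LENGTH * math.cos(angle)
        y += LINK_LENGTH * math.sin(angle)
    return x, y


def reach_error(params, goal):
    """Distance from the fingertip to the goal for params = radii + [excursion]."""
    *radii, excursion = params
    x, y = fingertip(joint_angles(radii, excursion))
    return math.hypot(x - goal[0], y - goal[1])


def clip(params):
    """Keep radii and excursion inside hypothetical fabrication limits."""
    radii = [min(max(r, 0.002), 0.020) for r in params[:-1]]
    excursion = min(max(params[-1], 0.0), 0.030)
    return radii + [excursion]


def one_plus_lambda_es(goal, start, iters=200, children=8, sigma=0.002, seed=0):
    """Elitist (1+lambda) search: the parent is replaced only by a better child."""
    rng = random.Random(seed)
    parent, parent_err = list(start), reach_error(start, goal)
    for _ in range(iters):
        kids = [clip([p + rng.gauss(0.0, sigma) for p in parent])
                for _ in range(children)]
        kid = min(kids, key=lambda c: reach_error(c, goal))
        kid_err = reach_error(kid, goal)
        if kid_err < parent_err:
            parent, parent_err = kid, kid_err
    return parent, parent_err


GOAL = (0.06, 0.12)
START = [0.008, 0.008, 0.008, 0.010]  # three radii, then the excursion
best, err = one_plus_lambda_es(GOAL, START)
print(best, err)
```

Because the search is elitist, the returned error can never exceed the starting error. A full co-optimization would replace the single excursion with a policy and would typically adapt the step size rather than keep `sigma` fixed.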
