Robust Contact-rich Manipulation through Implicit Motor Adaptation

Teng Xue, Amirreza Razmjoo, Suhan Shetty, Sylvain Calinon

TL;DR

This work tackles the challenge of robust contact-rich manipulation under uncertain physical parameters by introducing implicit motor adaptation (IMA), which retrieves parameter-conditioned policies from a probabilistic parameter distribution using tensor-train (TT) representations. Unlike explicit motor adaptation (EMA), IMA avoids precise system identification and retraining by leveraging domain contraction to combine online parameter uncertainty with TT-based policy retrieval. The authors provide theoretical analysis showing IMA's advantages over EMA, and demonstrate through simulation and real-robot planar push experiments that IMA yields robust, instance-aware behaviors across diverse objects and disturbances. The approach offers a practical pathway to robust sim-to-real transfer in manipulation tasks, with potential extensions to diffusion-based parameter estimation and hybrid TT-neural architectures.

Abstract

Contact-rich manipulation plays an important role in daily human activities. However, uncertain physical parameters often pose significant challenges for both planning and control. A promising strategy is to develop policies that are robust across a wide range of parameters. Domain adaptation and domain randomization are widely used, but they tend to either limit generalization to new instances or perform conservatively due to neglecting instance-specific information. Explicit motor adaptation addresses these issues by estimating system parameters online and then retrieving the parameter-conditioned policy from a parameter-augmented base policy. However, it typically requires precise system identification or additional training of a student policy, both of which are challenging in contact-rich manipulation tasks with diverse physical parameters. In this work, we propose implicit motor adaptation, which enables parameter-conditioned policy retrieval given a roughly estimated parameter distribution instead of a single estimate. We leverage tensor train as an implicit representation of the base policy, facilitating efficient retrieval of the parameter-conditioned policy by exploiting the separable structure of tensor cores. This framework eliminates the need for precise system estimation and policy retraining while preserving optimal behavior and strong generalization. We provide a theoretical analysis to validate the approach, supported by numerical evaluations on three contact-rich manipulation primitives. Both simulation and real-world experiments demonstrate its ability to generate robust policies across diverse instances. Project website: https://sites.google.com/view/implicit-ma.

Paper Structure

This paper contains 26 sections, 2 theorems, 25 equations, 15 figures, 3 tables.

Key Result

Proposition 1

Given the estimated parameter distribution $P(\hat{\bm{\alpha}})$, the parameter-conditioned policy can be retrieved from the weighted sum of parameter-specific advantage functions.
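
As a rough illustration of Proposition 1, the sketch below assumes that the parameter, state, and action spaces have been discretized, so that the base advantage function $A(\bm{\alpha}, \bm{x}, \bm{u})$ becomes a dense array. The grid sizes, the random array standing in for a learned advantage function, and the belief vector `p_alpha` are all hypothetical; this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discretization: 5 parameter values, 4 states, 3 actions.
n_alpha, n_x, n_u = 5, 4, 3

# Random stand-in for the parameter-augmented advantage A(alpha, x, u).
A = rng.normal(size=(n_alpha, n_x, n_u))

# Rough parameter belief P(alpha_hat): non-negative weights summing to 1.
p_alpha = np.array([0.05, 0.15, 0.50, 0.25, 0.05])

# Weighted sum over the parameter axis yields A_hat(x, u).
A_hat = np.einsum("a,axu->xu", p_alpha, A)

# Greedy action index for each discretized state.
policy = A_hat.argmax(axis=1)
```

Under this reading, a one-hot `p_alpha` reduces to looking up a single parameter value, which corresponds to the explicit case where a point estimate is trusted completely.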

Figures (15)

  • Figure 1: Deployment of the learned policy in a variety of contact-rich manipulation tasks. The left images illustrate the state and action spaces for each primitive (Hit, Push, Reorientation) in black and blue, respectively, along with the parameter spaces represented by the red variables: $m$ for mass, $\mu$ for the friction coefficient, $r$ for the radius, and $l$ for the length. A single policy is trained for each primitive and deployed directly across a wide range of objects with varying shapes, weights, and friction parameters, while preserving instance-specific optimal behaviors.
  • Figure 2: TT decomposition generalizes matrix decomposition techniques to higher-dimensional arrays. In TT format, an element of a tensor can be obtained by multiplying specific slices of the core tensors. The figure presents examples of third-order and fourth-order tensors. Image adapted from shetty2016tensor.
  • Figure 3: Pipeline of the proposed approach, including (1) parameter-augmented base policy learning, (2) probabilistic system adaptation with proprioceptive history, and (3) parameter-conditioned policy retrieval. The base policy and parameter-conditioned policy are implicitly represented by the corresponding advantage functions $A(\bm{\alpha}, \bm{x}, \bm{u})$ and $\hat{A}(\bm{x}, \bm{u})$, respectively. Blue-shaded modules are trained in simulation, and green-shaded ones are used in deployment, with the probabilistic system adaptation module bridging both stages.
  • Figure 4: Domain contraction in TT format. The parameter-augmented advantage function in TT format typically includes separate third-order cores for the different dimensions, such as parameter, state, and action. Given a probabilistic parameter distribution, we can retrieve the parameter-conditioned policy by taking the product of the parameter distribution with the corresponding TT cores.
  • Figure 5: Domain contraction unifies domain randomization and domain adaptation by giving different parameter distributions.
  • ...and 10 more figures
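
The element lookup described in Figure 2 and the domain contraction described in Figure 4 can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions: one discretized dimension each for parameter, state, and action, random cores standing in for a learned TT model, and hypothetical names (`G_alpha`, `G_x`, `G_u`, `p_alpha`) and ranks. The dense tensor `A_full` is built only to check the result and would not be formed in practice.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical mode sizes (parameter, state, action) and TT ranks.
n_alpha, n_x, n_u = 5, 4, 3
r1, r2 = 2, 3

# Third-order cores of shape (left rank, mode size, right rank).
G_alpha = rng.normal(size=(1, n_alpha, r1))
G_x = rng.normal(size=(r1, n_x, r2))
G_u = rng.normal(size=(r2, n_u, 1))


def tt_element(i, j, k):
    """Entry A[i, j, k] as a product of one slice from each core."""
    return (G_alpha[:, i, :] @ G_x[:, j, :] @ G_u[:, k, :]).item()


# Dense tensor, built only to verify the TT computations below.
A_full = np.einsum("aib,bjc,ckd->ijk", G_alpha, G_x, G_u)

# Domain contraction: weight the parameter core by the belief,
# then absorb the resulting (1, r1) matrix into the state core.
p_alpha = np.array([0.05, 0.15, 0.50, 0.25, 0.05])
v = np.einsum("i,aib->ab", p_alpha, G_alpha)
G_x_contracted = np.einsum("ab,bjc->ajc", v, G_x)

# A_hat(x, u) from the two remaining cores.
A_hat = np.einsum("ajc,ckd->jk", G_x_contracted, G_u)
```

Because the belief only touches the parameter core, the contraction costs a small matrix product rather than a sum over the full tensor, which is the separable structure the abstract refers to.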

Theorems & Definitions (3)

  • Proposition 1
  • Proposition 2
  • Example 1