Implicit Contact Diffuser: Sequential Contact Reasoning with Latent Point Cloud Diffusion

Zixuan Huang, Yinong He, Yating Lin, Dmitry Berenson

TL;DR

Implicit Contact Diffuser (ICD) is a diffusion-based model that generates a sequence of neural descriptors specifying a series of contact relationships between the object and the environment. This sequence then guides an MPC method to accomplish a given task.

Abstract

Long-horizon contact-rich manipulation has long been a challenging problem, as it requires reasoning over both discrete contact modes and continuous object motion. We introduce Implicit Contact Diffuser (ICD), a diffusion-based model that generates a sequence of neural descriptors that specify a series of contact relationships between the object and the environment. This sequence is then used as guidance for an MPC method to accomplish a given task. The key advantage of this approach is that the latent descriptors provide more task-relevant guidance to MPC, helping to avoid local minima for contact-rich manipulation tasks. Our experiments demonstrate that ICD outperforms baselines on complex, long-horizon, contact-rich manipulation tasks, such as cable routing and notebook folding. Our experiments also indicate that ICD can generalize a target contact relationship to a different environment. More visualizations can be found on our website: https://implicit-contact-diffuser.github.io/
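The idea of using a subgoal sequence as guidance for a sampling-based MPC controller can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: 2D points stand in for the latent contact descriptors, the dynamics simply add the action to the state, the planner is plain random shooting, and the names `plan_first_action` and `track_subgoals` are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def step(state, action):
    """Toy dynamics (assumption): the state is displaced by the action."""
    return tuple(s + a for s, a in zip(state, action))

def plan_first_action(state, subgoal, rng, horizon=3, num_samples=128, max_step=0.2):
    """Random shooting: sample action sequences, score each rollout by its
    summed squared distance to the subgoal, and return the best first action."""
    best_cost = float("inf")
    best_first = tuple(0.0 for _ in state)
    for _ in range(num_samples):
        actions = [tuple(rng.uniform(-max_step, max_step) for _ in state)
                   for _ in range(horizon)]
        s, cost = state, 0.0
        for a in actions:
            s = step(s, a)
            cost += sq_dist(s, subgoal)
        if cost < best_cost:
            best_cost, best_first = cost, actions[0]
    return best_first

def track_subgoals(state, subgoals, tol=0.1, max_steps=500, seed=0):
    """Track the subgoals in order; move on once the current one is within tol."""
    rng = random.Random(seed)
    reached = 0
    for _ in range(max_steps):
        if reached == len(subgoals):
            break
        if sq_dist(state, subgoals[reached]) <= tol ** 2:
            reached += 1  # subgoal met: switch to the next one
            continue
        action = plan_first_action(state, subgoals[reached], rng)
        state = step(state, action)
    return state, reached
```

Calling `track_subgoals((0.0, 0.0), [(1.0, 0.0), (1.0, 1.0)])` returns the final state and the number of subgoals reached. Intermediate subgoals give the planner a nearby target at every step, which is the intuition behind using them to avoid local minima over long horizons.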

Paper Structure

This paper contains 22 sections, 4 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: By predicting future contact sequences using a latent diffusion model, we enable long-horizon contact-rich deformable object manipulation such as cable routing using a sampling-based MPC controller.
  • Figure 2: System overview, with the notebook folding task as an example. First, ICD transforms the scene, current object, and goal object point clouds into an implicit contact representation using a modified NDF model. The NDF model can be used to extract point-wise contact relationships of the object, shown by the color. Next, we project the dense NDF point clouds into low-dimensional latent vectors and use a latent diffusion model to generate a sequence of contact subgoals. The latent diffusion model generates subgoals recursively from coarse to fine, depending on a reachability measure. Finally, we track these predicted subgoals using a sampling-based MPC method, ensuring that the object reaches the desired contact specification.
  • Figure 3: As shown in the upper figure, the NDF model is trained to encode local geometries of the scene by predicting occupancy and gradient direction of the Signed Distance Function (SDF) of the scene. Given an object point cloud ${\bm{P}}_o$, such as that of a notebook, we transform it into a contact-aware latent representation ${\bm{P}}_{ndf}$. In the bottom figure, we show how the reachability-aware point cloud VAE is trained. In addition to the regular reconstruction and KL divergence losses, we introduce a distributional reachability prediction loss to encourage temporal consistency in the latent space. The reachability predictor is also used in the latent diffusion model to decide the number of subgoals required for the task, as shown in Fig. 2.
  • Figure 4: We evaluate our methods on two long-horizon contact-rich tasks in simulation: cable routing and notebook folding. Goals are visualized in red.
  • Figure 5: Physical demonstration with a 7-DoF Kuka arm on cable routing with 3 different cables for a total of 10 runs. Videos are available on our website: https://implicit-contact-diffuser.github.io/.
  • ...and 2 more figures
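
The Figure 2 caption describes subgoals being generated recursively from coarse to fine, depending on a reachability measure. A minimal sketch of that control flow follows. It is an illustration under stated assumptions, not the authors' implementation: `propose_subgoal` is a hypothetical stand-in for the latent diffusion model, `steps_to_reach` for the learned reachability predictor, and the toy versions below use a geometric midpoint and Euclidean distance on 2D points.

```python
import math

def refine(start, goal, propose_subgoal, steps_to_reach, max_steps,
           depth=0, max_depth=6):
    """Return the subgoals after `start`, ending with `goal`.

    A segment is split at a proposed intermediate subgoal whenever the
    reachability estimate exceeds `max_steps` (hypothetical threshold);
    `max_depth` bounds the recursion.
    """
    if depth >= max_depth or steps_to_reach(start, goal) <= max_steps:
        return [goal]
    mid = propose_subgoal(start, goal)
    first_half = refine(start, mid, propose_subgoal, steps_to_reach,
                        max_steps, depth + 1, max_depth)
    second_half = refine(mid, goal, propose_subgoal, steps_to_reach,
                         max_steps, depth + 1, max_depth)
    return first_half + second_half

def toy_midpoint(a, b):
    """Toy stand-in for a generative subgoal proposal (assumption)."""
    return tuple((x + y) / 2 for x, y in zip(a, b))

def toy_steps(a, b):
    """Toy stand-in for a reachability predictor (assumption)."""
    return math.dist(a, b)

subgoals = refine((0.0, 0.0), (8.0, 0.0), toy_midpoint, toy_steps, max_steps=2.5)
print(subgoals)  # [(2.0, 0.0), (4.0, 0.0), (6.0, 0.0), (8.0, 0.0)]
```

In this toy setting the number of subgoals grows with the start-to-goal distance, mirroring the caption's point that the reachability predictor decides how many subgoals a task needs.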