Physics-Encoded Graph Neural Networks for Deformation Prediction under Contact

Mahdi Saleh, Michael Sommersperger, Nassir Navab, Federico Tombari

TL;DR

This work tackles the problem of predicting soft-object deformation under contact with rigid bodies in robotics. It introduces Physics-Encoded Graph Neural Networks that embed physical state on triangle-mesh graphs for both soft and rigid bodies and use cross-attention to model their interaction, followed by a decoder to reconstruct post-contact deformations. The method defines losses including $L_{mse}$ and a graph-consistency term, with total loss $L_T = L_{mse} + \lambda_G L_G$, enabling joint learning of geometry and physics. A new Everyday Deform dataset and a retina deformation dataset demonstrate accurate and efficient deformation predictions, and code and data are released to support research in robotic simulation and grasping.
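As a rough illustration of how the total loss $L_T = L_{mse} + \lambda_G L_G$ combines its two terms, the sketch below computes it on plain Python lists of 3D vertices. The summary does not define $L_G$, so the edge-length consistency term used here is only an assumed stand-in, and the function names and the default `lambda_g=0.1` are hypothetical.

```python
import math


def mse_loss(pred, target):
    """L_mse: mean squared error over all vertex coordinates."""
    total = sum(
        (p - t) ** 2
        for pv, tv in zip(pred, target)
        for p, t in zip(pv, tv)
    )
    return total / (len(pred) * len(pred[0]))


def graph_loss(pred, target, edges):
    """ASSUMED stand-in for L_G: mean squared difference between the
    predicted and ground-truth length of every mesh edge."""
    total = 0.0
    for i, j in edges:
        pred_len = math.dist(pred[i], pred[j])
        true_len = math.dist(target[i], target[j])
        total += (pred_len - true_len) ** 2
    return total / len(edges)


def total_loss(pred, target, edges, lambda_g=0.1):
    """L_T = L_mse + lambda_G * L_G (lambda_g value is hypothetical)."""
    return mse_loss(pred, target) + lambda_g * graph_loss(pred, target, edges)
```

With this stand-in, a prediction that is a rigid translation of the ground truth is penalized only by the $L_{mse}$ term, because all edge lengths are preserved.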

Abstract

In robotics, it's crucial to understand object deformation during tactile interactions. A precise understanding of deformation can elevate robotic simulations and have broad implications across different industries. We introduce a method using Physics-Encoded Graph Neural Networks (GNNs) for such predictions. Similar to robotic grasping and manipulation scenarios, we focus on modeling the dynamics between a rigid mesh contacting a deformable mesh under external forces. Our approach represents both the soft body and the rigid body within graph structures, where nodes hold the physical states of the meshes. We also incorporate cross-attention mechanisms to capture the interplay between the objects. By jointly learning geometry and physics, our model reconstructs consistent and detailed deformations. We've made our code and dataset public to advance research in robotic simulation and grasping.
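To make the graph representation concrete, here is a minimal sketch of turning a triangle mesh into a graph whose nodes carry a physical state next to their position. The abstract does not say which quantities make up that state, so the per-node force vector and stiffness value in the usage note are purely illustrative assumptions, and `mesh_to_graph` is a hypothetical helper rather than the authors' code.

```python
def mesh_to_graph(vertices, faces, node_state):
    """Build a graph from a triangle mesh.

    vertices:   list of (x, y, z) positions
    faces:      list of (a, b, c) vertex-index triples
    node_state: list of per-vertex physical features (ASSUMED contents)

    Returns (nodes, edges): each node is position + state concatenated,
    and edges is a sorted list of unique undirected (i, j) pairs, i < j.
    """
    edges = set()
    for a, b, c in faces:
        for i, j in ((a, b), (b, c), (c, a)):
            edges.add((min(i, j), max(i, j)))
    nodes = [list(v) + list(node_state[k]) for k, v in enumerate(vertices)]
    return nodes, sorted(edges)
```

For example, two triangles sharing an edge, `faces = [(0, 1, 2), (1, 3, 2)]`, yield five unique edges, and a state of `[fx, fy, fz, stiffness]` per vertex gives seven features per node.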

Paper Structure (23 sections, 3 equations, 6 figures, 5 tables)

This paper contains 23 sections, 3 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Given the input meshes of a soft and rigid body and their contact, we embed physical states into mesh graphs. Our network then processes the graphs to understand their interaction and predicts the deformation of the soft mesh.
  • Figure 2: Our pipeline unfolds in three stages: First, we integrate the physics and positional encoding for both soft and rigid objects. Second, we engage in the encoder phase to extract features for each body. Finally, we employ multi-head attention to facilitate interactions between the features of soft and rigid objects, subsequently leveraging the conditioned feature to decode and reconstruct the deformed mesh or graph.
  • Figure 3: Images from the SynthesEyes simulation framework: On the left, the 3D mesh depicts the retina's deformation, while the right presents a 2D camera projection showcasing the intricate deformation patterns of the retina. We utilize and simulate these deformations for our training.
  • Figure 4: The 'Everyday Deform' dataset features eight commonly deformable objects, each with distinct mesh topologies, densities, and deformation characteristics. Using a simulator, each object undergoes random collisions with rigid bodies. The resultant mesh and associated physics data are recorded and made available to the community.
  • Figure 5: The figure presents our quantitative results calculated from the Everyday Deform dataset. The resting pose is visualized on the right, with the rigid object in its initial position. At the center, we illustrate the deformation prediction from our network. The left images depict the ground truth deformation present in the dataset. Our method captures fine surface deformation and rigid motion, and even reflects interaction with the ground floor.
  • ...and 1 more figure
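The interaction stage described in the Figure 2 caption can be sketched as a cross-attention step in which soft-body node features attend over rigid-body node features. This is a deliberately simplified, single-head version with no learned projections; the paper uses multi-head attention, and the choice of soft features as queries, rigid features as keys and values, and the residual connection are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross_attention(soft_feats, rigid_feats):
    """Single-head scaled dot-product cross-attention (simplified).

    soft_feats:  per-node features of the soft body (queries, ASSUMED)
    rigid_feats: per-node features of the rigid body (keys and values)
    Returns soft features conditioned on the rigid body.
    """
    d = len(soft_feats[0])
    conditioned = []
    for q in soft_feats:
        scores = [dot(q, k) / math.sqrt(d) for k in rigid_feats]
        weights = softmax(scores)
        # Weighted sum of rigid-body features for this soft node.
        context = [
            sum(w * v[c] for w, v in zip(weights, rigid_feats))
            for c in range(d)
        ]
        # Residual connection (an assumption, not stated in the source).
        conditioned.append([qi + ci for qi, ci in zip(q, context)])
    return conditioned
```

In a full model the queries, keys, and values would pass through learned linear projections per head, and the conditioned features would then be fed to the decoder that reconstructs the deformed mesh.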