Generative Graphical Inverse Kinematics

Oliver Limoyo, Filip Marić, Matthew Giamou, Petra Alexson, Ivan Petrović, Jonathan Kelly

TL;DR

GGIK introduces a generative, graph-based IK solver that learns a multimodal distribution of inverse kinematics solutions conditioned on a partial distance-geometry graph of a robot. By encoding robots as distance graphs and employing $E(n)$-equivariant graph neural networks within a conditional variational autoencoder, GGIK generalizes across different manipulators and generates many valid configurations in parallel. Experiments show high accuracy on seen robots, reasonable generalization to unseen robots, and strong multimodality; GGIK also serves as an efficient initializer for local solvers, reducing computation time. The approach offers a versatile, differentiable IK component suitable for end-to-end learning and planning pipelines, with future work toward obstacle-aware distributions and non-coplanar geometries.

Abstract

Quickly and reliably finding accurate inverse kinematics (IK) solutions remains a challenging problem for many robot manipulators. Existing numerical solvers are broadly applicable but typically only produce a single solution and rely on local search techniques to minimize nonconvex objective functions. More recent learning-based approaches that approximate the entire feasible set of solutions have shown promise as a means to generate multiple fast and accurate IK results in parallel. However, existing learning-based techniques have a significant drawback: each robot of interest requires a specialized model that must be trained from scratch. To address this key shortcoming, we propose a novel distance-geometric robot representation coupled with a graph structure that allows us to leverage the sample efficiency of Euclidean equivariant functions and the generalizability of graph neural networks (GNNs). Our approach is generative graphical inverse kinematics (GGIK), the first learned IK solver able to accurately and efficiently produce a large number of diverse solutions in parallel while also displaying the ability to generalize -- a single learned model can be used to produce IK solutions for a variety of different robots. When compared to several other learned IK methods, GGIK provides more accurate solutions with the same amount of data. GGIK can generalize reasonably well to robot manipulators unseen during training. Additionally, GGIK can learn a constrained distribution that encodes joint limits and scales efficiently to larger robots and a high number of sampled solutions. Finally, GGIK can be used to complement local IK solvers by providing reliable initializations for a local optimization process.
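The distance-geometric idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's point-placement procedure: it assumes a planar three-link chain with one point per joint, and the helper names (`chain_points`, `distance_matrix`, `known_mask`) are hypothetical. It shows that two different configurations reaching the same end-effector position share the same partial set of distances (link lengths plus base-to-end-effector distance) and differ only in the remaining distances, which are the ones an IK solution has to fill in.

```python
import numpy as np

def chain_points(thetas, lengths):
    """Base, joint, and end-effector positions of a planar serial chain.

    Simplifying assumption: one point per joint in 2D, not the paper's
    point placement along joint rotation axes.
    """
    angles = np.cumsum(np.asarray(thetas, dtype=float))
    lengths = np.asarray(lengths, dtype=float)
    steps = np.stack([lengths * np.cos(angles), lengths * np.sin(angles)], axis=1)
    return np.vstack([np.zeros(2), np.cumsum(steps, axis=0)])

def distance_matrix(points):
    """All pairwise Euclidean distances (the complete graph)."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def known_mask(n_points):
    """Edges fixed by the robot structure and the goal (the partial graph)."""
    mask = np.zeros((n_points, n_points), dtype=bool)
    idx = np.arange(n_points - 1)
    mask[idx, idx + 1] = True   # link lengths: constant in every configuration
    mask[0, -1] = True          # base to end-effector: fixed by the goal
    return mask | mask.T

lengths = [1.0, 1.0, 1.0]
# Two different joint configurations that both place the end-effector at (2, 1).
config_a = [0.0, 0.0, np.pi / 2]
config_b = [np.pi / 2, -np.pi / 2, 0.0]

D_a = distance_matrix(chain_points(config_a, lengths))
D_b = distance_matrix(chain_points(config_b, lengths))
mask = known_mask(D_a.shape[0])

same_partial_graph = np.allclose(D_a[mask], D_b[mask])
same_complete_graph = np.allclose(D_a[~mask], D_b[~mask])
print(same_partial_graph, same_complete_graph)
```

In this toy setting the partial graphs agree while the complete graphs do not, which is why a single partial graph can correspond to many IK solutions.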

Paper Structure

This paper contains 34 sections, 15 equations, 9 figures, 5 tables, and 2 algorithms.

Figures (9)

  • Figure 1: The process of defining an IK problem as an incomplete or partial graph $\widetilde{G}$ of inter-point distances and the associated IK solution as a complete graph $G$. (a) Conventional forward kinematics model parameterized by joint angles and joint rotation axes. (b) The point placement procedure for the distance-based description, first introduced in 2021_Maric_Riemannian_B. Note that the six distances between points associated with pairs of consecutive joints remain constant regardless of the configuration. We annotate two out of six distances to reduce clutter. (c) A structure graph of the robot based on inter-point distances. (d) Addition of distances in red describing the robot end-effector pose using auxiliary points to define the base coordinate system, which completes the graphical IK problem description. All configurations of the robot reaching this end-effector pose will result in a partial graph of distances shown in (c) and (d). (e) The distances in blue that define a specific joint configuration.
  • Figure 2: Our GGIK solver is based on the CVAE framework. $\text{GNN}_{enc}$ encodes a complete graph representation of a manipulator into a latent graph representation and $\text{GNN}_{dec}$ "reconstructs" it. The prior network, $\text{GNN}_{prior}$, encodes the partial graph into a latent embedding that is near the embedding of the full graph. At test time, we decode the latent embedding of a partial graph into a complete graph to generate a solution.
  • Figure 3: Sampled conditional distributions from GGIK for various robotic manipulators produced by a single model. From left to right: KUKA IIWA, Franka Emika Panda, Schunk LWA4D, Schunk LWA4P, and Universal Robots UR10. Note that the end-effector poses are nearly identical in all cases, highlighting kinematic redundancy. Furthermore, the discrete solution sets of the two 6-DOF robots are also captured by our model.
  • Figure 4: Box-and-whiskers plots comparing the accuracy for identical models trained on datasets containing multiple robots distributed over 512,000 and 2,560,000 datapoints, respectively.
  • Figure 5: Pairwise plotting of 1,000 samples from GGIK (using a single model trained on all of the test manipulators) for a single KUKA goal pose. Each sub-plot contains samples on the 2-dimensional torus for joint angles $\theta_i$ and $\theta_j$, leading to a symmetrical pattern (i.e., the upper triangle is a transposition of the lower triangle). The hue of each sample is proportional to the end-effector's pose error (the sum of the Euclidean distance in metres and the angular distance in radians). For this particular pose, the continuous and orderly curves produced by accurate solutions indicate that GGIK is able to learn a relatively complex distribution over a large, varied solution set. Samples with higher errors appear to occur sporadically.
  • ...and 4 more figures
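
The pose error used for the hue in Figure 5 (Euclidean distance in metres plus angular distance in radians) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes poses are given as 4x4 homogeneous transforms and that the angular distance is the geodesic angle of the relative rotation, which the caption does not specify. The function name `pose_error` is hypothetical.

```python
import numpy as np

def pose_error(T_goal, T_sample):
    """Sum of position error (metres) and rotation angle error (radians).

    Assumption: both inputs are 4x4 homogeneous transforms, and the angular
    distance is the rotation angle of R_goal^T @ R_sample.
    """
    position_error = np.linalg.norm(T_goal[:3, 3] - T_sample[:3, 3])
    R_rel = T_goal[:3, :3].T @ T_sample[:3, :3]
    # trace(R) = 1 + 2 cos(angle); clip guards against round-off outside [-1, 1].
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return position_error + np.arccos(cos_angle)

# Example: sample is offset by 0.03 m along x and rotated 0.1 rad about z.
angle = 0.1
T_goal = np.eye(4)
T_sample = np.eye(4)
T_sample[:3, :3] = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
T_sample[0, 3] = 0.03

error = pose_error(T_goal, T_sample)
print(error)
```

Adding metres and radians mixes units, so this combined value is best read as a plotting score for comparing samples rather than as a physical quantity.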