IKDiffuser: a Diffusion-based Generative Inverse Kinematics Solver for Kinematic Trees

Zeyu Zhang; Ziyuan Jiao

TL;DR

IKDiffuser tackles inverse kinematics (IK) for arbitrary kinematic trees by casting IK as conditional diffusion over joint configurations, with a structure-agnostic, token-based representation of end-effector goals. It learns a generative prior over joint configurations conditioned on end-effector poses, supports partially specified goals through masked marginal inference, and accepts task objectives at inference time through objective-guided sampling. Objectives such as warm-start initialization and manipulability maximization therefore require no retraining. Its samples can also seed optimization-based IK solvers, raising their success rates while keeping latency in the millisecond range. Experiments across seven robotic platforms show better accuracy, diversity, and collision avoidance than the baselines, and seeding lifts the success rate on the 29-DoF Unitree G1 humanoid from 21.01% to 96.96%. The result is a scalable, adaptable primitive for real-time planning and control of complex robots that allows flexible task specification without sacrificing precision.

Abstract

Solving Inverse Kinematics (IK) for arbitrary kinematic trees presents significant challenges due to their high dimensionality, redundancy, and complex inter-branch constraints. Conventional optimization-based solvers can be sensitive to initialization and suffer from local minima or conflicting gradients. At the same time, existing learning-based approaches are often tied to a predefined number of end-effectors and a fixed training objective, limiting their reusability across robot morphologies and task requirements. To address these limitations, we introduce IKDiffuser, a scalable IK solver built upon conditional diffusion-based generative models, which learns the distribution of the configuration space conditioned on end-effector poses. We propose a structure-agnostic formulation that represents end-effector poses as a sequence of tokens, leading to a unified framework that handles varying numbers of end-effectors while learning the implicit kinematic structures entirely from data. Beyond standard IK generation, IKDiffuser handles partially specified goals via a masked marginalization mechanism that conditions only on a subset of end-effector constraints. Furthermore, it supports adding task objectives at inference through objective-guided sampling, enabling capabilities such as warm-start initialization and manipulability maximization without retraining. Extensive evaluations across seven diverse robotic platforms demonstrate that IKDiffuser significantly outperforms state-of-the-art baselines in accuracy, solution diversity, and collision avoidance. Moreover, when used to initialize optimization-based solvers, IKDiffuser raises success rates on challenging redundant systems with high Degrees of Freedom (DoF), such as the 29-DoF Unitree G1 humanoid, from 21.01% to 96.96%, while reducing computation time to the millisecond range.
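
To make the token-based goal representation and the masked handling of partially specified goals more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each end-effector goal is a 7-dimensional pose (position plus unit quaternion) and that unspecified end-effectors are zero-filled and flagged by a boolean mask. The function name `build_goal_tokens`, the constant `POSE_DIM`, and the pose layout are hypothetical choices made for illustration only.

```python
import numpy as np

POSE_DIM = 7  # assumed layout: xyz position followed by a wxyz quaternion


def build_goal_tokens(goals, num_effectors):
    """Pack end-effector goals into one token per end-effector plus a mask.

    goals: dict mapping an end-effector index to a length-7 pose.
    Indices absent from `goals` are treated as unspecified: their token
    stays zero and their mask entry stays False.
    """
    tokens = np.zeros((num_effectors, POSE_DIM), dtype=np.float32)
    mask = np.zeros(num_effectors, dtype=bool)
    for idx, pose in goals.items():
        if not 0 <= idx < num_effectors:
            raise IndexError(f"end-effector index {idx} out of range")
        pose = np.array(pose, dtype=np.float32)
        if pose.shape != (POSE_DIM,):
            raise ValueError(f"pose for end-effector {idx} must have length {POSE_DIM}")
        quat_norm = np.linalg.norm(pose[3:])
        if quat_norm == 0.0:
            raise ValueError(f"quaternion for end-effector {idx} has zero norm")
        pose[3:] /= quat_norm  # store a unit quaternion
        tokens[idx] = pose
        mask[idx] = True
    return tokens, mask


# Hypothetical robot with four end-effectors; only indices 0 and 2 get goals.
tokens, mask = build_goal_tokens(
    {
        0: [0.3, 0.2, 0.9, 1.0, 0.0, 0.0, 0.0],
        2: [0.3, -0.2, 0.9, 2.0, 0.0, 0.0, 0.0],
    },
    num_effectors=4,
)
```

Under these assumptions, a conditional model would consume `tokens` together with `mask`, attending only to entries where the mask is true. The same packing then serves robots with different numbers of end-effectors and goals that constrain only some of them.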
