ViIK: Flow-based Vision Inverse Kinematics Solver with Fusing Collision Checking

Qinglong Meng, Chongkun Xia, Xueqian Wang

TL;DR

The Vision Inverse Kinematics solver (ViIK) is a flow-based vision method that outputs diverse feasible configurations by fusing inverse kinematics with collision checking, using RGB images to perceive the environment.

Abstract

Inverse kinematics (IK) finds the robot configurations that satisfy a target pose of the end effector. In motion planning, diverse configurations are required in case a feasible trajectory cannot be found. Meanwhile, collision checking (CC), e.g. with oriented bounding boxes (OBB), discrete oriented polytopes (DOP), or Quickhull \cite{quickhull}, must be performed for each configuration provided by the IK solver to ensure that every goal configuration passed to motion planning is available. The classical IK solver and the CC algorithm must therefore be executed repeatedly, once per configuration, so the preparation time is long when many goal configurations are required, e.g. for motion planning in cluttered environments. Moreover, classical collision-checking algorithms require structured maps, which can be difficult to obtain. To sidestep these two issues, we propose the Vision Inverse Kinematics solver (ViIK), a flow-based vision method that outputs diverse available configurations by fusing inverse kinematics and collision checking. ViIK uses RGB images to perceive the environment. ViIK can output 1000 configurations within 40 ms, with an accuracy of about 3 millimeters and 1.5 degrees. Higher accuracy can be obtained by refining the output with a classical IK solver for a few iterations. The self-collision rate can be lower than 2%, and the rate of collision with the environment can be lower than 10% in most scenes. The code is available at: https://github.com/AdamQLMeng/ViIK.
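
The refinement step mentioned in the abstract (an approximate configuration polished by a classical iterative solver) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hypothetical planar 2-link arm with position-only targets, and the names `forward_kinematics`, `jacobian`, `refine`, and the link lengths are placeholders chosen for illustration. The seed stands in for a solution produced by a learned solver.

```python
import math

# Hypothetical link lengths of a planar 2-link arm (illustrative only).
L1, L2 = 1.0, 0.8


def forward_kinematics(q):
    """Return the (x, y) position of the end effector for joint angles q."""
    q0, q1 = q
    x = L1 * math.cos(q0) + L2 * math.cos(q0 + q1)
    y = L1 * math.sin(q0) + L2 * math.sin(q0 + q1)
    return x, y


def jacobian(q):
    """Return the 2x2 position Jacobian as ((a, b), (c, d))."""
    q0, q1 = q
    s1, c1 = math.sin(q0), math.cos(q0)
    s12, c12 = math.sin(q0 + q1), math.cos(q0 + q1)
    return ((-L1 * s1 - L2 * s12, -L2 * s12),
            (L1 * c1 + L2 * c12, L2 * c12))


def position_error(q, target):
    """Return the (ex, ey) offset from the current tip position to the target."""
    x, y = forward_kinematics(q)
    return target[0] - x, target[1] - y


def refine(seed, target, tol=1e-6, max_iters=10):
    """Polish an approximate IK solution with Newton-Raphson steps.

    Returns the refined joint angles, the final position error, and the
    number of iterations actually used.
    """
    q0, q1 = seed
    iterations = 0
    ex, ey = position_error((q0, q1), target)
    while math.hypot(ex, ey) >= tol and iterations < max_iters:
        (a, b), (c, d) = jacobian((q0, q1))
        det = a * d - b * c
        if abs(det) < 1e-9:
            break  # near a singularity; stop rather than divide by ~0
        # Solve J * dq = e with the closed-form 2x2 inverse.
        q0 += (d * ex - b * ey) / det
        q1 += (a * ey - c * ex) / det
        iterations += 1
        ex, ey = position_error((q0, q1), target)
    return (q0, q1), math.hypot(ex, ey), iterations


if __name__ == "__main__":
    target = forward_kinematics((0.5, 0.9))
    # A seed that is off by a few hundredths of a radian.
    refined, err, used = refine((0.53, 0.87), target)
    print(refined, err, used)
```

Because the seed is already close to a valid solution, the Newton steps converge quadratically, which is why only a few iterations are needed in this toy setting.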

Paper Structure

This paper contains 18 sections, 15 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: The architecture of the classical motion planning workflow. The IK solver first generates diverse configurations, and each configuration is then checked for collision (configurations shown in red are in collision). Finally, all collision-free configurations are used as goal states for motion planning.
  • Figure 2: The architecture of a Vision Inverse Kinematics solver (ViIK-2) in the generative direction. ViIK is a dual flow-based model: one flow generates configurations, and the other maps multi-view images into latent space. The ViIK Encoder first maps images into latent space with sequential MBConv blocks, then uses an MLP to fuse the image features with the target pose.
  • Figure 3: Examples comparing ViIK with the classical IK solver (TRAC-IK). (a) and (b) show an example in Env-1; (c) and (d) show an example in Env-3. Env-1 and Env-3 are the hard scenes among those used in this paper, with collision rates higher than 75%. The examples show that the solutions output by ViIK have a lower collision rate than those of TRAC-IK.
  • Figure 4: The runtime of ViIK-2 and TRAC-IK+CC. The tolerance of TRAC-IK is set to $1\times 10^{-3}$, similar to the errors of ViIK. Collision checking in TRAC-IK+CC uses mesh colliders.
  • Figure 5: The 10 typical scenes used for evaluating ViIK in this paper, which are proposed in benchmark. The pairs (Env-1, Env-2), (Env-3, Env-4), (Env-5, Env-6), (Env-7, Env-8), and (Env-9, Env-10) are used for ViIK-2. Env-1 to Env-5 and Env-6 to Env-10 are used for ViIK-5. Env-1 to Env-10 are used for ViIK-10.
  • ...and 1 more figure
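
The classical workflow described in the Figure 1 caption (generate IK solutions, check each one for collision, keep the collision-free ones as goals) can be sketched as follows. This is a toy illustration under stated assumptions, not the pipeline evaluated in the paper: it uses a hypothetical planar 2-link arm with analytic IK, circular obstacles, and a point-to-segment distance test in place of a real collision checker. All function names and constants are placeholders.

```python
import math

# Hypothetical link lengths of a planar 2-link arm (illustrative only).
L1, L2 = 1.0, 0.8


def ik_solutions(x, y):
    """Analytic IK: return up to two joint configurations reaching (x, y)."""
    cos_q2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2.0 * L1 * L2)
    if abs(cos_q2) > 1.0:
        return []  # target is out of reach
    solutions = []
    for sign in (1.0, -1.0):  # the two elbow branches
        q2 = sign * math.acos(cos_q2)
        q1 = math.atan2(y, x) - math.atan2(L2 * math.sin(q2),
                                           L1 + L2 * math.cos(q2))
        solutions.append((q1, q2))
    return solutions


def point_segment_distance(p, a, b):
    """Return the shortest distance from point p to segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def in_collision(q, obstacles):
    """Check both links against circular obstacles given as ((cx, cy), r)."""
    elbow = (L1 * math.cos(q[0]), L1 * math.sin(q[0]))
    tip = (elbow[0] + L2 * math.cos(q[0] + q[1]),
           elbow[1] + L2 * math.sin(q[0] + q[1]))
    links = [((0.0, 0.0), elbow), (elbow, tip)]
    return any(point_segment_distance(center, a, b) <= radius
               for center, radius in obstacles
               for a, b in links)


def collision_free_goals(target, obstacles):
    """Run IK first, then filter every candidate with the collision check."""
    return [q for q in ik_solutions(*target)
            if not in_collision(q, obstacles)]


if __name__ == "__main__":
    obstacles = [((0.5, 0.9), 0.15)]
    print(collision_free_goals((1.2, 0.6), obstacles))
```

Note that the collision check runs once per candidate configuration, so the cost of this loop grows linearly with the number of goal configurations requested, which is the overhead the abstract attributes to the classical workflow.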