Self-Contained Calibration of an Elastic Humanoid Upper Body Using Only a Head-Mounted RGB Camera

Johannes Tenhumberg, Dominik Winkelbauer, Darius Burschka, Berthold Bäuml

TL;DR

The paper addresses the challenge of accurately calibrating an elastic humanoid upper body when external measurement systems are impractical. It proposes a self-contained calibration workflow that uses only a head-mounted RGB camera, simple end-effector markers, and a marker on a pole, coupled with an elastic forward-kinematics model and a virtual Cartesian noise term in a MAP framework. The key contributions are the end-to-end calibration of the full kinematic chain (including torso elasticity and camera intrinsics), an efficient configuration-collection strategy, and validation showing an average end-effector error of $3.9\,\mathrm{mm}$ and a worst-case error of $9.1\,\mathrm{mm}$, comparable to external-tracking baselines. This approach enables faster, lab-free calibration with direct relevance to whole-body motion planning and manipulation in humanoid robots.

Abstract

When a humanoid robot performs a manipulation task, it first builds a model of the world using its visual sensors and then plans the motion of its body in this model. This requires precise calibration of the camera parameters and the kinematic tree. Besides the accuracy of the calibrated model, the calibration process should be fast and self-contained, i.e., no external measurement equipment should be used. Therefore, we extend our prior work on calibrating the elastic upper body of DLR's Agile Justin by now using only its internal head-mounted RGB camera. We use simple visual markers at the ends of the kinematic chain and one in front of the robot, mounted on a pole, to get measurements for the whole kinematic tree. To ensure that the task-relevant Cartesian error at the end-effectors is minimized, we introduce virtual noise to fit our imperfect robot model so that the pixel error has a higher weight if the marker is further away from the camera. This correction reduces the Cartesian error by more than 20%, resulting in a final accuracy of 3.9 mm on average and 9.1 mm in the worst case. This way, we achieve the same precision as in our previous work, where an external Cartesian tracking system was used.
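
The distance-dependent weighting described in the abstract can be illustrated with a small sketch. It assumes a pinhole camera, so that a Cartesian standard deviation `sigma_c` at depth `z` projects to roughly `focal_px * sigma_c / z` pixels, and it assumes this projected term adds in quadrature to the real pixel noise `sigma_u`. The function names, the combination rule, and all numeric values are hypothetical and chosen for illustration only; the paper's exact formulation is not reproduced here.

```python
import math


def effective_pixel_sigma(z, sigma_u, sigma_c, focal_px):
    """Effective pixel standard deviation for a marker at depth z (metres).

    Assumption: pinhole projection u = f * x / z, so Cartesian noise sigma_c
    contributes f * sigma_c / z pixels; its variance scales with 1 / z**2.
    """
    projected = focal_px * sigma_c / z
    return math.sqrt(sigma_u ** 2 + projected ** 2)


def weighted_residual(pixel_error, z, sigma_u=0.5, sigma_c=0.003, focal_px=500.0):
    """Pixel error scaled by the inverse effective sigma (hypothetical defaults)."""
    return pixel_error / effective_pixel_sigma(z, sigma_u, sigma_c, focal_px)


# The same 2-pixel error at the near and far end of the 0.2 m to 1.5 m range.
near = weighted_residual(2.0, z=0.2)
far = weighted_residual(2.0, z=1.5)
print(f"near: {near:.3f}, far: {far:.3f}")
```

Under these assumptions, the same pixel error yields a larger weighted residual for the distant marker, which is the qualitative behaviour the abstract describes: errors far from the camera correspond to larger Cartesian deviations and are therefore penalized more strongly.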

Paper Structure (11 sections, 11 equations, 9 figures, 1 table)

Figures (9)

  • Figure 1: DLR's Agile Justin [Bauml2014] collecting measurements using its head-mounted RGB camera for the calibration of its elastic forward kinematics as well as the camera's intrinsic and extrinsic parameters. Only simple markers on both hands as well as the depicted marker mounted on a pole are used. As described in the section on configuration selection, we select filtered random configurations to identify the robot in its whole workspace.
  • Figure 2: Sketch of the calibration setup. The robot collects images of markers on both of its hands and on a pole in front of it. The blue chains show how forward kinematics plus camera projection close the measurement loop. Even though the arms are not directly involved in the pole measurements, their mass distribution in different positions influences the torso elasticities.
  • Figure 3: DLR's Agile Justin collects measurements to calibrate its non-geometric forward kinematics. The images are from the robot's internal RGB camera with a resolution of $640\!\times\!480$, showing examples for the left arm, the right arm, and the pole. The markers' distances to the camera vary between measurements from 0.2 m up to 1.5 m. Without a correction (red), the pixel error is uniformly distributed over the distances, leading to larger Cartesian errors for detections further away from the camera, as one pixel there corresponds to a larger area. The correction (blue) counteracts this and improves the Cartesian accuracy by 20%.
  • Figure 4: The probabilistic graph of the calibration problem includes the camera and robot model from the section on the robot model. It describes how the markers' pixel coordinates $u$ are computed from the joint configuration $q$ and the model parameters $\Theta$ for each of the $N$ samples. Left (without red parts): In the original mapping, the real pixel measurement noise $\eta_u$ is the only source of stochasticity. Right: An additional virtual Cartesian noise node is added to compensate for the imperfect (actually deterministic) kinematic model. Left (with red parts): As shown in the section on virtual noise, the virtual noise can be incorporated into the original model, resulting in an effective pixel noise with a $\tilde{\sigma}_u$ depending on the distance of the marker to the camera ($\propto 1/z^2$).
  • Figure 5: The different marker positions in the image for the left arm, the pole on the floor, and the right arm. We move the pan-tilt joints of the robot's neck to get good coverage of the image over all markers.
  • ...and 4 more figures