Data-efficient, Explainable and Safe Box Manipulation: Illustrating the Advantages of Physical Priors in Model-Predictive Control

Achkan Salehi, Stephane Doncieux

TL;DR

This work tackles data efficiency, explainability, and safety in robot control by injecting physical priors into Model Predictive Control (MPC) for a box-rotation task on the SOTO2 conveyor gripper. It replaces a black-box environment model with a gray-box model that learns a voxel-based mass distribution $\hat{\Pi}$ from a short exploration phase, then computes the center of mass $r_c$ and the inertia matrix $I_c$ and plans via MPC using the dynamics $M\ddot{r}_c=F$ and $\dot{\omega}=I_c^{-1}\tau$, with torque estimated from the contact surfaces. The approach yields zero-shot generalization to unseen mass distributions, improved data efficiency, and built-in safety via early aborts, outperforming a black-box baseline in safety-critical scenarios. Limitations include the supervision required for $\hat{\Pi}$, sim-to-real transfer, and manually specified priors, suggesting directions such as domain randomization and automated prior extraction for broader applicability.
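
To make the two dynamics equations concrete, here is a minimal sketch of one forward-prediction step that an MPC rollout could use. It is an illustration under stated assumptions, not the authors' implementation: the function name `euler_step`, the explicit Euler integrator, and the time step `dt` are all hypothetical choices. It uses the angular dynamics exactly as written above, $\dot{\omega}=I_c^{-1}\tau$, without a gyroscopic term.

```python
import numpy as np

def euler_step(r_c, v_c, omega, F, tau, M, I_c, dt):
    """One explicit Euler step of M * r_c'' = F and omega' = I_c^{-1} * tau.

    The integrator and all argument names are assumptions for illustration.
    """
    r_c = np.asarray(r_c, dtype=float)
    v_c = np.asarray(v_c, dtype=float)
    omega = np.asarray(omega, dtype=float)

    # Linear acceleration of the center of mass.
    a = np.asarray(F, dtype=float) / M
    # Angular acceleration; solve I_c * alpha = tau instead of inverting I_c.
    alpha = np.linalg.solve(I_c, np.asarray(tau, dtype=float))

    r_next = r_c + dt * v_c          # position uses the current velocity
    v_next = v_c + dt * a
    omega_next = omega + dt * alpha
    return r_next, v_next, omega_next
```

A planner would apply such a step repeatedly over the horizon for each candidate control sequence, with $F$ and $\tau$ derived from the controls and the contact surfaces.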

Abstract

Model-based RL/control have gained significant traction in robotics. Yet, these approaches often remain data-inefficient and lack the explainability of hand-engineered solutions. This makes them difficult to debug/integrate in safety-critical settings. However, in many systems, prior knowledge of environment kinematics/dynamics is available. Incorporating such priors can help address the aforementioned problems by reducing problem complexity and the need for exploration, while also facilitating the expression of the decisions taken by the agent in terms of physically meaningful entities. Our aim with this paper is to illustrate and support this point of view via a case-study. We model a payload manipulation problem based on a real robotic system, and show that leveraging prior knowledge about the dynamics of the environment in an MPC framework can lead to improvements in explainability, safety and data-efficiency, leading to satisfying generalization properties with less data.

Paper Structure (11 sections, 14 equations, 4 figures, 2 tables)

This paper contains 11 sections, 14 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: (a) The SOTO2 robot (left) and its gripper (right), which is composed of two conveyor belts. The motion of each belt cover is set via independent velocity controls, and the distance between the two can be adjusted via position control. The aim of a control episode is to rotate boxes of varying and unknown mass distributions by $\frac{\pi}{2}$ in a safe manner, i.e. the box should either remain well balanced on the conveyors, or manipulation should safely be aborted. (b) Our pybullet simulation of the gripper, based on roller conveyors. (c, top) Initial conditions in each episode. (c, bottom) Target pose. (d) Example final state from a safe rotation, where the box stays well balanced. Depending on the mass distribution and the center of mass, this can be difficult to achieve, leading to unsafe rotation as shown in (e).
  • Figure 2: (a) Top-view diagram of the two conveyor belts with illustrative force vectors $F_i$. The shaded volumes, which are those that rest on the conveyor belts, are used to compute the torque, the friction and the mapping from velocity controls to forces. They are denoted $S_1$ (green) and $S_2$ (red) in the text. (b) Top-view illustration of the conveyors with a box on top of them, along with the reference frame $x,y$ attached to the conveyors. Also shown are the different position and velocity controls. The position controls $\delta p^+_i,\delta p^-_i$ move the conveyors closer together or further apart along the $y$ axis, and the velocity controls $\delta v^+_i,\delta v^-_i$ move the surface of the conveyors up/down along the $x$ axis. The action space is thus in $\mathbb{R}^4$. (c,d) Example of ground-truth and estimated mass distributions. Each distribution is approximated by a binary occupancy grid and a total mass value $M$, equally distributed between the occupied cells. In our implementation, a small positive value is added to the mass of each voxel, in order to avoid degenerate cases with zero friction at contact points. The predicted mass distribution is used to compute the inertia matrix and the center of mass.
  • Figure 3: (a,b) Mappings obtained from the applied velocity controls to the different $F_i$. (c) Balance score obtained on the $30$ non-hazardous mass distributions. The shaded area corresponds to one standard deviation around the median. Note that the maximum error (dashed red lines) is caused by an inaccurate prediction from $\hat{\Pi}$.
  • Figure 4: Mass distributions (referred to as Distributions A,B,C,D) used in the first set of experiments.
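
The Figure 2 caption describes how a mass distribution is represented (a binary occupancy grid, a total mass $M$ split equally between occupied cells, and a small positive mass added to every voxel) and states that the center of mass and the inertia matrix are computed from it. The sketch below shows one way this computation could look. It is not the authors' code: the function name `mass_properties`, the cubic `voxel_size` parameter, the default `eps` value, and the point-mass approximation of each voxel are all assumptions made for illustration.

```python
import numpy as np

def mass_properties(occupancy, total_mass, voxel_size, eps=1e-6):
    """Center of mass and inertia matrix of a voxelized mass distribution.

    occupancy  : (nx, ny, nz) binary array, 1 where a voxel holds mass
    total_mass : scalar M, split equally between the occupied voxels
    voxel_size : edge length of a cubic voxel (hypothetical parameter)
    eps        : small mass added to every voxel (the value is an assumption)
    """
    occ = np.asarray(occupancy, dtype=float)
    n_occupied = occ.sum()
    if n_occupied == 0:
        raise ValueError("occupancy grid has no occupied voxel")

    # Equal share of M per occupied voxel, plus eps on every voxel.
    m = (occ * (total_mass / n_occupied) + eps).reshape(-1)

    # Voxel centers in the box frame, with the origin at the grid corner.
    idx = np.stack(
        np.meshgrid(*[np.arange(s) for s in occ.shape], indexing="ij"), axis=-1
    )
    p = ((idx + 0.5) * voxel_size).reshape(-1, 3)

    # Center of mass: mass-weighted mean of the voxel centers.
    r_c = (m[:, None] * p).sum(axis=0) / m.sum()

    # Each voxel treated as a point mass:
    # I_c = sum_k m_k * (|d_k|^2 * Id - d_k d_k^T), with d_k = p_k - r_c.
    d = p - r_c
    sq = (d ** 2).sum(axis=1)
    I_c = (m * sq).sum() * np.eye(3) - np.einsum("k,ki,kj->ij", m, d, d)
    return r_c, I_c
```

For example, calling `mass_properties(grid, total_mass=2.0, voxel_size=0.05)` on a grid whose occupied cells all lie on one side of the box returns an `r_c` shifted toward that side, which is the quantity a planner can compare against the conveyor contact regions when deciding whether a rotation is safe.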