Object-centric Task Representation and Transfer using Diffused Orientation Fields

Cem Bilaloglu, Tobias Löw, Sylvain Calinon

TL;DR

This work tackles online transfer of object-centric tasks on curved objects by introducing Diffused Orientation Fields (DOF), a diffusion-based framework that constructs smoothly varying local reference frames conditioned on object geometry and sparse keypoints. DOF combines surface diffusion on point clouds with workspace diffusion (via Walk on Spheres) to produce orientation fields in $SO(3)$, enabling shape-invariant local actions and modular integration with teleoperation, trajectory optimization, and reinforcement learning. The approach leverages keypoints as inductive cues, supports multiple surface representations, and demonstrates robust transfer for peeling, slicing, and coverage across diverse objects, including deformed pears, under noise and occlusions. The authors provide open-source code and show that the framework improves transferability, robustness, and planning efficiency, while remaining compatible with various control paradigms and scalable to complex scenes.

Abstract

Curved objects pose a fundamental challenge for skill transfer in robotics: unlike planar surfaces, they do not admit a global reference frame. As a result, task-relevant directions such as "toward" or "along" the surface vary with position and geometry, making object-centric tasks difficult to transfer across shapes. To address this, we introduce an approach using Diffused Orientation Fields (DOF), a smooth representation of local reference frames, for transfer learning of tasks across curved objects. By expressing manipulation tasks in these smoothly varying local frames, we reduce the problem of transferring tasks across curved objects to establishing sparse keypoint correspondences. DOF is computed online from raw point cloud data using diffusion processes governed by partial differential equations, conditioned on keypoints. We evaluate DOF under geometric, topological, and localization perturbations, and demonstrate successful transfer of tasks requiring continuous physical interaction such as inspection, slicing, and peeling across varied objects. Our open-source code is available at https://github.com/idiap/diffused_fields_robotics
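
To make the idea of a keypoint-conditioned surface field concrete, here is a minimal sketch, not the authors' implementation. It assumes a k-nearest-neighbor graph Laplacian as a stand-in for the point-cloud diffusion operator and solves only the steady state of the diffusion (harmonic interpolation with the keypoint directions as fixed boundary values). The function names `knn_graph_laplacian` and `diffuse_frames`, the paraboloid test patch, and the keypoint directions are all hypothetical choices for illustration.

```python
import numpy as np

def knn_graph_laplacian(points, k=8):
    """Unweighted, symmetrized k-NN graph Laplacian (an assumed stand-in
    for a proper point-cloud Laplacian)."""
    n = len(points)
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    nbrs = np.argsort(d2, axis=1)[:, 1:k + 1]      # skip self at column 0
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)                         # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def diffuse_frames(points, normals, key_idx, key_dirs, k=8):
    """Spread keypoint directions over the cloud and build one frame per point.

    Returns an (n, 3, 3) array whose columns are the x-axis (diffused tangent
    direction), the y-axis, and the z-axis (surface normal).
    """
    n = len(points)
    L = knn_graph_laplacian(points, k)
    free = np.setdiff1d(np.arange(n), key_idx)
    X = np.zeros((n, 3))
    X[key_idx] = key_dirs                          # fixed values at keypoints
    # Steady state of diffusion: L X = 0 on the free points.
    X[free] = np.linalg.solve(L[np.ix_(free, free)],
                              -L[np.ix_(free, key_idx)] @ key_dirs)
    # Project onto the tangent plane and normalize to get the x-axis.
    X -= (X * normals).sum(axis=1, keepdims=True) * normals
    x_axis = X / np.linalg.norm(X, axis=1, keepdims=True)
    y_axis = np.cross(normals, x_axis)             # completes a right-handed frame
    return np.stack([x_axis, y_axis, normals], axis=-1)

# Hypothetical example: a gently curved paraboloid patch with two keypoints.
xs = np.linspace(-1.0, 1.0, 10)
gx, gy = np.meshgrid(xs, xs)
pts = np.stack([gx.ravel(), gy.ravel(), 0.2 * (gx.ravel() ** 2 + gy.ravel() ** 2)], axis=1)
nrm = np.stack([-0.4 * pts[:, 0], -0.4 * pts[:, 1], np.ones(len(pts))], axis=1)
nrm /= np.linalg.norm(nrm, axis=1, keepdims=True)
key_idx = np.array([0, 99])
key_dirs = np.array([[1.0, 0.0, 0.0], [0.8, 0.6, 0.0]])
frames = diffuse_frames(pts, nrm, key_idx, key_dirs)
```

The sketch interpolates each vector component independently and only then projects onto the tangent plane, which is a simplification; it can degenerate wherever the interpolated vector becomes parallel to the normal.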

Paper Structure

This paper contains 41 sections, 30 equations, 14 figures, 4 tables.

Figures (14)

  • Figure 1: Overview of the proposed representation with complete workflow. We use Spot [craneRobustFairingConformal2013] as our canonical point cloud throughout the paper as it provides a sufficiently complex object with a well-defined symmetry axis for cross-section views. (A) Overview of the DOF computation. (i) Our method takes a point cloud collected at runtime and keypoints as its input. (ii) We compute a surface orientation field conditioned on the keypoints by solving the diffusion PDE on the point cloud. We visualize the orientation field by using local reference frames. We show the x-axis in red, the y-axis in green and we omit the z-axis for clarity. (iii) We extend the surface orientation field to any point in the workspace using workspace diffusion. DOF represents smoothly varying local reference frames across the workspace by considering the object's surface geometry. We visualize it as a grid to show its smoothness, but in practice, we evaluate the field only at the query position. (B) Overview of our robot control workflow using local reference frames and actions. High-level controllers query the DOF at the robot position to obtain the local reference frame and produce local actions expressed in that frame for downstream tasks. These local actions are provided as references for the low-level tracking controller.
  • Figure 2: Movie 1: Summary of our method and results. The accompanying video demonstrates our approach across tasks, objects, surface representations, and controllers in the real world. We also show how DOFs are constructed from point clouds and task-specific keypoints collected at runtime to represent object-centric local reference frames throughout the workspace.
  • Figure 3: Shape-invariant task representation and transfer using DOF. (A) (i) An object-centric trajectory (gray) composed of an approaching phase and a surface-following phase. Small reference frames are the object-centric local reference frames represented by DOF, while the large dashed frame denotes a single body-fixed frame. (ii) The gray trajectory expressed in the object-centric local frames (solid lines) and in the single body frame (dashed lines). (B) Local action primitives providing shape-invariant task descriptions in object-centric local reference frames. (C) Simulated visualizations showing the keypoints (blue), tool paths (black), and the x-axis of local frames (red). (D) Real-world experiments showing transfer of slicing, peeling, and tactile coverage tasks to novel objects.
  • Figure 4: Task transfer across objects. (A) Transferred peeling trajectories (red) across 50 pear instances (black). Each trajectory consists of three peeling cycles. (B) Comparison of using multiple discrete body-fixed reference frames versus a field of local reference frames provided by DOF. (i) Visualization of 50 body-fixed reference frames sampled on a pear instance. (ii) Average standard deviation of the transferred action (velocity) trajectories with respect to the number of sampled body-fixed reference frames. (C and D) Comparisons of DOF with baselines in terms of the full and period-aligned transferred trajectory statistics, respectively. Action trajectory statistics expressed in (i) Cartesian, (ii) cylindrical, (iii) spherical body-fixed reference frames and (iv) local reference frames provided by DOF.
  • Figure 5: Transfer across controllers using DOF. (A) DOF as a controller-agnostic intermediate representation that can be integrated with various reactive and anticipative controllers. (B) Teleoperation using a space mouse and the LEAP hand [shawLEAPHandLowCost2023], where the input axes are mapped to local frames. Moving along the x-axis (red arrow) slides the tool along the surface, while moving along the z-axis (blue arrow) approaches the surface. (C) Trajectory optimization experiments for (i) distance tracking, (ii) target reaching and obstacle avoidance, (iii) reaching without warm-starting and (iv) reaching with warm-starting using the DOF. (v) Norm of the change in the control commands, which is used as the convergence criterion for the trajectory optimization, showing the effect of warm-starting. (D) Learning and transferring policies using reinforcement learning in local reference frames. (i) Reward evolution while learning a reaching policy using local and global reference frames. (ii) Learned target reaching and distance tracking policy in local reference frames of a 2-D circle. Zero-shot transfer of the learned policy on the circle to (iii) a 2-D rectangle and (iv) a 3-D point cloud.
  • ...and 9 more figures
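
The workflow in the Figure 1 caption (extend the surface field into the workspace, query a frame at the robot position, express an action in that frame) can be sketched as follows. This is an illustrative approximation under stated assumptions, not the authors' code: it uses a basic Walk on Spheres estimator whose sphere radius is the distance to the nearest point-cloud sample, caps the walk length and falls back to the nearest sample, averages the boundary frames as matrices, and projects the average back onto $SO(3)$ with an SVD. The names `project_to_so3`, `walk_on_spheres_frame`, and `local_to_world` and all parameter values are hypothetical.

```python
import numpy as np

def project_to_so3(M):
    """Closest rotation matrix to M in the Frobenius sense, via SVD."""
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])   # enforce det = +1
    return U @ D @ Vt

def walk_on_spheres_frame(query, surface_pts, surface_frames,
                          n_walks=64, eps=0.05, max_steps=100, rng=None):
    """Monte Carlo estimate of the diffused frame at a workspace point.

    Each walk jumps to a uniformly random point on the largest sphere that
    contains no surface sample, until it lands within eps of a sample or
    runs out of steps (in which case the nearest sample is used: an
    assumption of this sketch, needed because walks outside an object
    may wander off).
    """
    rng = np.random.default_rng() if rng is None else rng
    acc = np.zeros((3, 3))
    for _ in range(n_walks):
        x = np.asarray(query, dtype=float).copy()
        for _step in range(max_steps):
            d = np.linalg.norm(surface_pts - x, axis=1)
            i = int(np.argmin(d))
            if d[i] < eps:
                break
            u = rng.normal(size=3)
            x = x + d[i] * u / np.linalg.norm(u)      # jump onto the sphere
        acc += surface_frames[i]
    return project_to_so3(acc / n_walks)

def local_to_world(frame, local_action):
    """Express a local action (e.g. a velocity) in world coordinates."""
    return frame @ np.asarray(local_action, dtype=float)

# Hypothetical example: a unit-sphere point cloud with a constant surface frame.
rng = np.random.default_rng(0)
sphere_pts = rng.normal(size=(500, 3))
sphere_pts /= np.linalg.norm(sphere_pts, axis=1, keepdims=True)
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
sphere_frames = np.broadcast_to(R, (500, 3, 3))

frame_at_robot = walk_on_spheres_frame([1.5, 0.0, 0.0], sphere_pts, sphere_frames, rng=rng)
v_world = local_to_world(frame_at_robot, [1.0, 0.0, 0.0])   # "move along local x"
```

Because the estimate is evaluated only at the query position, no workspace grid is needed, which matches the caption's remark that the field is evaluated only where it is queried. Averaging rotation matrices and projecting is one simple choice among several for combining orientations.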