CRISP -- Compliant ROS2 Controllers for Learning-Based Manipulation Policies and Teleoperation

Daniel San José Pro, Oliver Hausdörfer, Ralf Römer, Maximilian Dösch, Martin Schuck, Angela P. Schoellig

TL;DR

CRISP tackles the challenge of executing learning-based manipulation policies that output low-frequency or discontinuous targets by delivering a robot-agnostic, real-time torque-control framework for ROS2. It combines Cartesian impedance and operational-space control with null-space projection, joint barriers, gravity/Coriolis and friction compensation, and target-wrench capabilities into a modular stack whose components can be toggled per task. The system integrates with ROS2 control and Pinocchio, and provides Python and Gymnasium interfaces (CRISP_PY, CRISP_GYM) to streamline data collection and policy deployment across hardware and simulation. Evaluated on hardware (Franka FR3) and in simulation (Kuka IIWA14, Kinova Gen3), CRISP demonstrates accurate tracking, effective teleoperation, and policy execution at real-time frequencies, offering a practical pathway to rapid experimentation with learning-based manipulation. The approach lowers integration barriers and broadens the applicability of learning-based methods to a range of ROS2-compatible manipulators, enabling faster iteration and deployment of action-chunk-based policies.

Abstract

Learning-based controllers, such as diffusion policies and vision-language action models, often generate low-frequency or discontinuous robot state changes. Achieving smooth reference tracking requires a low-level controller that converts high-level target commands into joint torques, enabling compliant behavior during contact interactions. We present CRISP, a lightweight C++ implementation of compliant Cartesian and joint-space controllers for the ROS2 control standard, designed for seamless integration with high-level learning-based policies as well as teleoperation. The controllers are compatible with any manipulator that exposes a joint-torque interface. Through our Python and Gymnasium interfaces, CRISP provides a unified pipeline for recording data from hardware and simulation and deploying high-level learning-based policies seamlessly, facilitating rapid experimentation. The system has been validated on hardware with the Franka Robotics FR3 and in simulation with the Kuka IIWA14 and Kinova Gen3. Designed for rapid integration, flexible deployment, and real-time performance, our implementation provides a unified pipeline for data collection and policy execution, lowering the barrier to applying learning-based methods on ROS2-compatible manipulators. Detailed documentation is available at the project website: https://utiasDSL.github.io/crisp_controllers.
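
To make the phrase "converts high-level target commands into joint torques" concrete, the following is a generic textbook form of a Cartesian impedance law with a null-space term and gravity compensation. It is a sketch only: the symbols ($K$, $D$, $e$, $J^{\#}$, $\tau_{\text{null}}$) are assumptions introduced here for illustration and are not taken from the paper's own equations, which may differ in terms and notation.

```latex
\tau \;=\; J(q)^{\top}\bigl(K\,e \;-\; D\,J(q)\,\dot{q}\bigr)
\;+\; \bigl(I - J(q)^{\top}\,(J(q)^{\#})^{\top}\bigr)\,\tau_{\text{null}}
\;+\; g(q)
```

Here $e$ is the pose error between the most recent target and the current end-effector pose, $K$ and $D$ are Cartesian stiffness and damping matrices, $J^{\#}$ is a generalized inverse of the Jacobian, $\tau_{\text{null}}$ is a secondary joint-space torque, and $g(q)$ is the gravity torque. The second term projects $\tau_{\text{null}}$ so that it produces no end-effector force, because $J J^{\#} = I$ for a full-row-rank Jacobian.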

Paper Structure

This paper contains 15 sections, 8 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: CRISP controllers (https://github.com/utiasDSL/crisp_controllers) architecture overview and integration with robots using ROS2. We provide CRISP_PY (https://github.com/utiasDSL/crisp_py), an easy-to-use Python interface to the CRISP controllers, as well as the Gymnasium [towers2024gymnasium] environments we use for data collection and policy deployment in CRISP_GYM (https://github.com/utiasDSL/crisp_gym). The user can publish any of the available target commands at arbitrary frequencies via ROS2 topics, and the robot will track the most recent target. High-level learning-based policies such as vision-language-action (VLA) models typically use the target_pose topic. We run the CRISP controllers on a real-time-enabled Linux workstation at 1 kHz, while CRISP_GYM with the VLA model runs on a separate workstation in the same network, communicating via ROS2 topics. CRISP_GYM also integrates the robotic grippers, cameras, and additional sensors required for the task via ROS2 topics.
  • Figure 2: Robot-agnostic Cartesian control [I]: Error evolution for tracking a discontinuous target pose using a Franka Emika FR3 on hardware with CRISP. We set a new target pose at $t=1$ s. For reproducibility, the controller parameters used are available on our website. Note that exact tracking performance depends on controller parameterization, which should be chosen based on the specific task. Left: Position error evolution with steady-state errors of 5.54 mm, 4.73 mm, and 0.81 mm for OSC, CI, and CI-clipped, respectively. Right: Rotational error evolution with steady-state errors of 0.0998 rad, 0.0532 rad, and 0.0029 rad for OSC, CI, and CI-clipped, respectively.
  • Figure 3: Robot-agnostic Cartesian control [II]: We demonstrate our controllers on a Franka FR3 on hardware, and on multiple manipulators in simulation. Left: Franka Robotics FR3. Middle: Kinova Gen3. Right: Kuka IIWA14.
  • Figure 4: Unified data collection and policy deployment [I]: Recording a block-stacking task with Lego using teleoperation and force-torque feedback for the operator. Upper left: Plot of the follower’s target and end-effector positions. Lower left: Feedback wrenches applied at the leader manipulator. Upper right: Image of the recording on the follower manipulator at the current time. Lower right: Image of the operator guiding the leader robot to record data on the follower at the current time.
  • Figure 5: Unified data collection and policy deployment [II]: Using the same recorded data in the CRISP_GYM interface, different learned policies can be successfully deployed on hardware. Left: Diffusion Policy [chi2024diffusionpolicy] operating at approximately 30 Hz. Right: SmolVLA [shukor2025smolvla] operating at approximately 10 Hz. Both policies provide end-effector target pose updates that are tracked by our CRISP controllers. Note the different timescales shown in the plots. For policy deployment, we used the CI controllers with CRISP_GYM.
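
The captions of Figures 1, 2, and 5 describe a pattern in which a slow policy (roughly 10 to 30 Hz) publishes targets while a 1 kHz controller always tracks the most recent one. The following minimal Python sketch illustrates that pattern on a one-degree-of-freedom point mass. It is not the CRISP implementation and does not use ROS2: the names `LatestTarget` and `simulate`, the gains, the mass, and the error bound are all hypothetical. In particular, the error clipping is only an assumption about what the "CI-clipped" variant in Figure 2 might do.

```python
from dataclasses import dataclass


@dataclass
class LatestTarget:
    """Holds only the most recent target; older targets are overwritten."""
    value: float = 0.0

    def publish(self, new_value: float) -> None:
        self.value = new_value


def simulate(policy_targets, policy_hz=10, control_hz=1000,
             stiffness=400.0, damping=40.0, mass=1.0, max_error=0.05):
    """Track low-rate targets with a 1-DoF impedance law at the control rate.

    All numeric defaults are illustrative assumptions, not CRISP parameters.
    """
    dt = 1.0 / control_hz
    steps_per_target = control_hz // policy_hz
    target = LatestTarget()
    position, velocity = 0.0, 0.0
    trajectory = []
    for new_target in policy_targets:
        # Low-rate, possibly discontinuous update from the policy.
        target.publish(new_target)
        for _ in range(steps_per_target):
            error = target.value - position
            # Assumed clipping: bound the error so a distant target
            # cannot command an arbitrarily large force.
            error = max(-max_error, min(max_error, error))
            force = stiffness * error - damping * velocity
            # Semi-implicit Euler integration of the point mass.
            velocity += (force / mass) * dt
            position += velocity * dt
            trajectory.append(position)
    return trajectory


if __name__ == "__main__":
    # A single step of 0.2 m, held for 2 s of policy updates at 10 Hz.
    traj = simulate([0.2] * 20)
    print(f"final position: {traj[-1]:.4f} m after {len(traj)} control steps")
```

With these assumed gains the system is critically damped, so the position approaches the target without overshoot, and the clipped error caps the commanded force at `stiffness * max_error` regardless of how far the target jumps.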