Causality-Based Reinforcement Learning Method for Multi-Stage Robotic Tasks

Jiechao Deng, Ning Tan

TL;DR

This work tackles the difficulty of learning multi-stage robotic tasks with reinforcement learning by introducing causality as a core design principle. It automatically discovers stage-specific causal relationships between actions and rewards to construct a minimal, non-redundant action space, and integrates these relations using a causal policy gradient to reduce gradient variance. The proposed framework comprises automated causal matrix discovery (data collection, reward prediction, and KL-based causal identification) followed by training with the learned causal structures, evaluated on mobile and pure manipulation tasks with Fetch in iGibson. Results demonstrate that using causal actions and causal gradients yields faster, more stable learning and fewer progress reversals than baselines relying on full action spaces or manually specified causality. This approach advances practical RL for complex, multi-stage robotic tasks by combining data-driven causal discovery with targeted policy optimization.
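The discovery pipeline named above (data collection, reward prediction, KL-based causal identification) can be illustrated with a small sketch. This summary does not give the paper's actual predictor or test, so everything here is an assumption: a linear least-squares reward predictor, Gaussian predictive distributions with a shared variance, intervention by permuting one action dimension, and a hand-picked threshold. The function names (`fit_reward_predictor`, `discover_causal_matrix`) and the toy data are hypothetical.

```python
import numpy as np

def fit_reward_predictor(actions, rewards):
    """Fit a linear predictor (with bias) of every reward term from the actions.

    Returns the weight matrix and the residual variance of each reward term.
    """
    X = np.hstack([actions, np.ones((actions.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, rewards, rcond=None)  # shape (K + 1, J)
    residual = rewards - X @ W
    return W, residual.var(axis=0) + 1e-8

def predict(W, actions):
    """Predicted mean of every reward term for each sample."""
    X = np.hstack([actions, np.ones((actions.shape[0], 1))])
    return X @ W

def discover_causal_matrix(actions, rewards, threshold=0.05, seed=0):
    """Return a boolean (K, J) matrix: entry (k, j) is True when shuffling
    action dimension k noticeably shifts the predicted reward term j."""
    rng = np.random.default_rng(seed)
    W, var = fit_reward_predictor(actions, rewards)
    base = predict(W, actions)
    num_actions, num_terms = actions.shape[1], rewards.shape[1]
    kl = np.zeros((num_actions, num_terms))
    for k in range(num_actions):
        perturbed = actions.copy()
        perturbed[:, k] = rng.permutation(actions[:, k])  # intervene on a_k only
        shifted = predict(W, perturbed)
        # KL between two Gaussians with equal variance: (mu_p - mu_q)^2 / (2 var)
        kl[k] = np.mean((base - shifted) ** 2 / (2.0 * var), axis=0)
    return kl > threshold, kl

# Toy data (hypothetical): 3 action dimensions, 2 reward terms.
data_rng = np.random.default_rng(0)
actions = data_rng.uniform(-1.0, 1.0, size=(2000, 3))
noise = 0.1 * data_rng.standard_normal((2000, 2))
rewards = np.stack([2.0 * actions[:, 0], -1.5 * actions[:, 2]], axis=1) + noise

mask, kl = discover_causal_matrix(actions, rewards)
print(mask)
```

On this toy data, only the pairs (action 0, reward 0) and (action 2, reward 1) are constructed to be causal, so only those entries should exceed the threshold. A learned nonlinear predictor could replace the linear fit without changing the overall structure.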

Abstract

Deep reinforcement learning has made significant strides in various robotic tasks. However, employing deep reinforcement learning methods to tackle multi-stage tasks remains a challenge. Reinforcement learning algorithms often encounter issues such as redundant exploration, getting stuck in dead ends, and progress reversal in multi-stage tasks. To address this, we propose a method that integrates causal relationships with reinforcement learning for multi-stage tasks. Our approach enables robots to automatically discover the causal relationships between their actions and the rewards of the tasks and constructs the action space using only causal actions, thereby reducing redundant exploration and progress reversal. By integrating correct causal relationships using the causal policy gradient method into the learning process, our approach can enhance the performance of reinforcement learning algorithms in multi-stage robotic tasks.
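As a rough illustration of how a discovered causal matrix might be used, the sketch below selects the causal actions and weights each action dimension's policy-gradient term only by the advantages of the reward terms it is causally linked to. This is an assumed formulation, not the paper's stated one: the factored Gaussian policy with a state-independent mean, the per-term advantages, and all function names are hypothetical.

```python
import numpy as np

def causal_action_indices(causal_matrix):
    """Indices of action dimensions linked to at least one reward term."""
    return np.flatnonzero(causal_matrix.any(axis=1))

def per_action_advantages(causal_matrix, term_advantages):
    """Map per-reward-term advantages (T, J) to per-action advantages (T, K).

    Action k only receives the advantages of the reward terms it causes.
    """
    return term_advantages @ causal_matrix.T.astype(float)

def causal_pg_mean_gradient(mean, std, actions, term_advantages, causal_matrix):
    """Policy-gradient estimate w.r.t. the mean of a factored Gaussian policy.

    Each action dimension's score function is weighted by its own causal
    advantage, so unrelated reward terms add no variance to that dimension.
    """
    adv = per_action_advantages(causal_matrix, term_advantages)  # (T, K)
    score = (actions - mean) / std ** 2  # d log N(a; mean, std) / d mean
    return (score * adv).mean(axis=0)

# Hypothetical causal matrix: 3 action dimensions, 2 reward terms.
causal_matrix = np.array([[True, False],
                          [False, False],
                          [False, True]])
term_advantages = np.array([[1.0, -2.0],
                            [0.5, 0.0]])
sampled_actions = np.array([[0.2, -0.4, 0.1],
                            [-0.1, 0.3, 0.5]])
mean = np.zeros(3)
std = np.ones(3)

active = causal_action_indices(causal_matrix)
action_adv = per_action_advantages(causal_matrix, term_advantages)
grad = causal_pg_mean_gradient(mean, std, sampled_actions,
                               term_advantages, causal_matrix)
print(active, action_adv, grad)
```

In this example the second action dimension has no causal link to any reward term, so it is excluded from the action space and its gradient component is exactly zero.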

Paper Structure

This paper contains 24 sections, 4 equations, 13 figures, 4 tables, 3 algorithms.

Figures (13)

  • Figure 1: During normal execution, the end effector should be adjusted to face downwards. Here, movement of the robotic arm caused the position of the end effector to deviate, so the task regressed from step 2 back to step 1, resulting in a progress reversal.
  • Figure 2: Two Action Selection Schemes
  • Figure 3: A multi-stage task is broken down into multiple subtasks. For each subtask, the causal relationships of the current stage are discovered, used to construct the action space of the current policy, and integrated into the learning process using the causal policy gradient method.
  • Figure 4: Procedure for Calculating the Difference for Each Pair ($a_k$, $r_j^i$)
  • Figure 5: Mobile Manipulation Task
  • ...and 8 more figures