Trajectory Planning of Robotic Manipulator in Dynamic Environment Exploiting DRL

Osama Ahmad; Zawar Hussain; Hammad Naeem

TL;DR

The paper tackles trajectory planning for a 7-DOF robotic arm in dynamic, unknown environments with moving obstacles, where the arm must complete a block-pick task within a fixed time. It employs a Deep Deterministic Policy Gradient (DDPG) actor–critic framework with experience replay and Polyak-updated target networks, and compares sparse and dense reward formulations. The study reports that sparse rewards give faster convergence and higher success in obstacle-rich scenarios, while dense rewards can hinder training as the task grows more complex, which makes the reward formulation a practical design decision for real-time DRL-based planning. The results, obtained in a MuJoCo/Gymnasium-based simulation of a 7-DOF Fetch arm, are relevant to industrial robotics: they inform reward shaping and point to possible future integration with Graph Neural Networks or MPC to improve adaptability in dynamic environments.
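As a rough illustration of the Polyak-updated target networks mentioned above, the sketch below applies the standard soft update, target = (1 - tau) * target + tau * online, to flat lists of parameters. The function name `polyak_update`, the value of `tau`, and the use of plain Python lists in place of network weights are assumptions made for illustration; they are not taken from the paper.

```python
def polyak_update(target, online, tau=0.005):
    """Soft-update target parameters toward online parameters, in place.

    Hypothetical helper: `target` and `online` are flat lists of floats
    standing in for the weights of the target and online networks.
    """
    if len(target) != len(online):
        raise ValueError("parameter lists must have the same length")
    for i, (t, o) in enumerate(zip(target, online)):
        # A small tau makes the target network track the online one slowly.
        target[i] = (1.0 - tau) * t + tau * o
    return target


# Usage: one soft update with an assumed tau of 0.1.
online_params = [1.0, -2.0, 0.5]
target_params = [0.0, 0.0, 0.0]
polyak_update(target_params, online_params, tau=0.1)
```

With a small `tau`, the target values used by the critic change slowly, which is the usual motivation for this update in DDPG.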

Abstract

This study implements a reinforcement learning algorithm for the trajectory planning of manipulators. A 7-DOF robotic arm must pick a randomly placed block and place it at a random target point in an unknown environment. A randomly moving obstacle hinders the arm from picking the object. The robot's objective is to avoid the obstacle and pick the block within a fixed time limit. In this work, we apply a deep deterministic policy gradient (DDPG) algorithm and compare the model's efficiency under dense and sparse rewards.
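To make the dense versus sparse distinction concrete, here is a minimal sketch of one common goal-based convention, similar to the one used by Fetch-style simulation environments: the sparse reward is 0 once the block is within a distance threshold of the target and -1 otherwise, while the dense reward is the negative distance to the target. The function names, the 0.05 threshold, and the omission of any obstacle-related term are assumptions; the paper's exact reward definitions may differ.

```python
import math


def goal_distance(achieved, desired):
    """Euclidean distance between the achieved and desired positions."""
    return math.dist(achieved, desired)


def sparse_reward(achieved, desired, threshold=0.05):
    """Return 0.0 on success (within threshold), otherwise -1.0.

    The threshold value is an assumed tolerance, not a figure from the paper.
    """
    return 0.0 if goal_distance(achieved, desired) <= threshold else -1.0


def dense_reward(achieved, desired):
    """Return the negative distance, so that closer is always better."""
    return -goal_distance(achieved, desired)


# Usage: a block 0.5 units from its target fails the sparse test
# but still receives a graded dense signal.
block, target = (0.0, 0.0, 0.0), (0.3, 0.4, 0.0)
print(sparse_reward(block, target), dense_reward(block, target))
```

The sparse form gives the agent feedback only on success or failure, whereas the dense form provides a graded signal at every step.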

Paper Structure (13 sections, 9 equations, 7 figures, 1 table, 1 algorithm)

Introduction
Methodology
Modeling of Robotic Arm
Formulation of Control Strategies
Optimization Problem
Implementation of DRL based algorithm
Simulation
Experimental Setup
Results and Discussion
Future Study and Work
Conclusion
Notation Definition
HyperParameter

Figures (7)

Figure 1: Basic Diagram of Reinforcement Learning Scheme
Figure 2: Deep Deterministic Framework for Trajectory Planning for Robotic Arm
Figure 3: Robotic environment in OpenAI gym a) no obstacle b) obstacle
Figure 4: Success rate with no obstacle a) sparse reward b) dense reward
Figure 5: Actor loss when an obstacle is moving with a) sparse reward b) dense reward
...and 2 more figures
