Path Planning and Reinforcement Learning-Driven Control of On-Orbit Free-Flying Multi-Arm Robots

Álvaro Belmonte-Baeza, José Luis Ramón, Leonard Felicetti, Miguel Cazorla, Jorge Pomares

Abstract

This paper presents a hybrid approach that combines trajectory optimization (TO) and reinforcement learning (RL) for motion planning and control of free-flying multi-arm robots in on-orbit servicing scenarios. TO generates feasible, efficient paths subject to dynamic and kinematic constraints, while RL provides adaptive trajectory tracking under uncertainties. The multi-arm robot design, equipped with thrusters for precise body control, enables redundancy and stability in complex space operations. TO optimizes arm motions and thruster forces, reducing reliance on the arms for stabilization and enhancing maneuverability. RL further refines this by leveraging model-free control to adapt to dynamic interactions and disturbances. Comprehensive simulations demonstrate the effectiveness and robustness of the proposed hybrid approach. Two case studies are explored: surface motion with initial contact and a free-floating scenario requiring surface approximation. In both cases, the hybrid method outperforms traditional strategies. In particular, the thrusters notably enhance motion smoothness, safety, and operational efficiency. The RL policy effectively tracks TO-generated trajectories, handling high-dimensional action spaces and dynamic mismatches. Integrating TO and RL combines precise, task-specific planning with robust adaptability, supporting high performance in the uncertain and dynamic conditions characteristic of space environments. By addressing challenges such as motion coupling, environmental disturbances, and dynamic control requirements, this framework establishes a strong foundation for advancing the autonomy and effectiveness of space robotic systems.

Paper Structure (20 sections, 14 equations, 16 figures, 9 tables, 1 algorithm)

Figures (16)

  • Figure 1: Overview of the hybrid approach proposed in this paper. TO-based motion planning provides a reference motion for the robot with constraints specific to an on-orbit application, while the RL-driven motion control robustly and adaptively tracks this reference motion.
  • Figure 2: Visualization of the described system architecture and on-orbit operational environment.
  • Figure 3: Overview of the Reinforcement Learning-driven control methodology. The policy network receives an observed robot state as input and outputs desired joint positions and thruster forces for the robot. These are fed to the physics simulation, yielding a new state. The transition is used to update the network parameters via an RL algorithm of choice, and the loop is repeated until convergence.
  • Figure 4: Multi-arm robot used in our experiments.
  • Figure 5: Position and orientation of the robot base during the trajectory.
  • ...and 11 more figures