Adaptive Social Force Window Planner with Reinforcement Learning

Mauro Martini, Noé Pérez-Higueras, Andrea Ostuni, Marcello Chiaberge, Fernando Caballero, Luis Merino

TL;DR

The paper tackles human-aware navigation with the Social Force Window (SFW) planner, which combines the Dynamic Window Approach (DWA) with a Social Force Model cost, and adds a Soft Actor-Critic (SAC) Deep Reinforcement Learning agent that adapts the cost weights to the current context. The core idea is to learn continuous cost-weight vectors that balance social interaction forces, obstacle avoidance, and efficiency, yielding trajectories that are both safe and socially compliant. The authors report that the resulting SFW-SAC method improves success rates and achieves a favorable trade-off between speed and social constraints across diverse simulated scenarios. This hybrid learning-control approach improves robustness and generalization for social navigation, with potential implications for real service robots and for broader benchmarking in human-robot interaction.

Abstract

Human-aware navigation is a complex task for mobile robots, requiring an autonomous navigation system capable of achieving efficient path planning together with socially compliant behaviors. Social planners usually add costs or constraints to the objective function, leading to intricate tuning processes or tailoring the solution to the specific social scenario. Machine Learning can enhance planners' versatility and help them learn complex social behaviors from data. This work proposes an adaptive social planner, using a Deep Reinforcement Learning agent to dynamically adjust the weighting parameters of the cost function used to evaluate trajectories. The resulting planner combines the robustness of the classic Dynamic Window Approach, integrated with a social cost based on the Social Force Model, and the flexibility of learning methods to boost the overall performance on social navigation tasks. Our extensive experimentation on different environments demonstrates the general advantage of the proposed method over static cost planners.
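
The idea of adjusting the weights of a trajectory cost function at run time can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cost term names (`goal`, `obstacle`, `social`), the `Candidate` type, and all numeric values are assumptions chosen for illustration, and the weights are hard-coded here where the paper would obtain them from the learned agent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """One candidate velocity command with precomputed cost terms (hypothetical)."""
    v: float                 # linear velocity [m/s]
    w: float                 # angular velocity [rad/s]
    costs: Dict[str, float]  # assumed term name -> normalized cost in [0, 1]


def total_cost(costs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the cost terms; lower is better."""
    return sum(weights[name] * costs[name] for name in weights)


def select_command(candidates: List[Candidate], weights: Dict[str, float]) -> Candidate:
    """Pick the candidate with the lowest weighted cost, as a DWA-style planner would."""
    return min(candidates, key=lambda c: total_cost(c.costs, weights))


# Two made-up candidates: a fast one that passes close to people,
# and a slower one that keeps more distance.
candidates = [
    Candidate(0.6, 0.0, {"goal": 0.1, "obstacle": 0.2, "social": 0.9}),
    Candidate(0.3, 0.4, {"goal": 0.4, "obstacle": 0.1, "social": 0.2}),
]

# Static weights stay fixed; an adaptive planner would replace them
# at every step with the weights output by the policy.
static_weights = {"goal": 1.0, "obstacle": 1.0, "social": 0.2}
crowded_weights = {"goal": 1.0, "obstacle": 1.0, "social": 2.0}

best_static = select_command(candidates, static_weights)
best_crowded = select_command(candidates, crowded_weights)
print(best_static.v, best_crowded.v)
```

With the low social weight the faster candidate wins; raising the social weight makes the planner prefer the slower candidate that keeps more distance. This is the kind of context-dependent switch that fixed weights cannot provide.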

Paper Structure (14 sections, 3 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: The Social Force Window (SFW) Planner combines the standard Dynamic Window Approach and the Social Force Model. The trajectory scoring process is optimized by a DRL agent that dynamically adjusts the cost weights based on local environmental conditions.
  • Figure 2: Workflow of the main steps performed by the proposed adaptive Social Force Window (SFW-SAC) Planner with DRL. The policy network learns to set the weights of the social cost used by the DWA for each situation.
  • Figure 3: Schematic of the policy network architecture. State composition is illustrated with separate inputs: goal distance and angle, previous cost weights, people position and velocity, and LiDAR ranges. The new cost weights are predicted as output action of the policy network.
  • Figure 4: Gazebo simulation environments where the agent has been trained (top) and tested (top and bottom). People's trajectories are indicated with dotted lines, robot starting poses with a circle, and goals with a cross and the associated episode number.
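
The state composition described in the Figure 3 caption (goal distance and angle, previous cost weights, people position and velocity, LiDAR ranges) could be assembled into a single input vector roughly as follows. This is a sketch under stated assumptions: the function name `build_state`, the ordering of the components, the cap on the number of people, the zero padding, and the range clipping are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

# Assumed layout for one person: (x, y, vx, vy) in the robot frame.
Person = Tuple[float, float, float, float]


def build_state(
    goal_xy: Tuple[float, float],
    prev_weights: Sequence[float],
    people: Sequence[Person],
    lidar_ranges: Sequence[float],
    max_people: int = 4,      # assumption: fixed number of person slots
    max_range: float = 10.0,  # assumption: LiDAR readings are clipped
) -> List[float]:
    """Flatten the inputs named in the Figure 3 caption into one vector."""
    gx, gy = goal_xy
    # Goal distance and angle relative to the robot.
    state = [math.hypot(gx, gy), math.atan2(gy, gx)]
    # Cost weights chosen at the previous step.
    state.extend(prev_weights)
    # Keep the nearest people and pad unused slots with zeros (assumption).
    nearest = sorted(people, key=lambda p: math.hypot(p[0], p[1]))[:max_people]
    for person in nearest:
        state.extend(person)
    state.extend([0.0] * 4 * (max_people - len(nearest)))
    # Clipped LiDAR ranges.
    state.extend(min(r, max_range) for r in lidar_ranges)
    return state


state = build_state(
    goal_xy=(3.0, 4.0),
    prev_weights=[1.0, 1.0, 0.5],
    people=[(2.0, 0.0, -0.5, 0.0)],
    lidar_ranges=[1.5, 12.0, 3.0],
    max_people=2,
)
print(len(state), state[0])
```

A fixed-length vector is used here only because it is the simplest input for a feed-forward policy; the caption shows the inputs as separate branches, so the actual network may process each group differently.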