COLSON: Controllable Learning-Based Social Navigation via Diffusion-Based Reinforcement Learning

Yuki Tomita, Kohei Matsumoto, Yuki Hyodo, Ryo Kurazume

TL;DR

COLSON is a diffusion-based reinforcement learning framework for social navigation that combines a graph neural network encoder with Q-score matching to produce multimodal actions in dynamic pedestrian environments. It adds two post-training guidance mechanisms (SDEdit-based action smoothing and obstacle-avoidance guidance) so that the policy can adapt to unseen conditions such as static obstacles without retraining. The approach outperforms baselines in simulated circle-crossing scenarios and across varying pedestrian densities, and its real-world feasibility is validated on a mobile robot. The work advances diffusion-based policies for mobile robotics and highlights practical benefits for smooth, safe, and scalable navigation in human-centric environments.

Abstract

Mobile robot navigation in dynamic environments with pedestrian traffic is a key challenge in the development of autonomous mobile service robots. Recently, deep reinforcement learning-based methods have been actively studied and have outperformed traditional rule-based approaches owing to their optimization capabilities. Among these, methods that assume a continuous action space typically rely on a Gaussian distribution assumption, which limits the flexibility of generated actions. Meanwhile, the application of diffusion models to reinforcement learning has advanced, allowing for more flexible action distributions compared with Gaussian distribution-based approaches. In this study, we applied a diffusion-based reinforcement learning approach to social navigation and validated its effectiveness. Furthermore, by leveraging the characteristics of diffusion models, we propose an extension that enables post-training action smoothing and adaptation to static obstacle scenarios not considered during the training steps.
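
The limitation of a Gaussian action distribution mentioned above can be illustrated with a small toy example. The sketch below is not taken from the paper: it assumes a hypothetical one-dimensional steering command where a pedestrian stands straight ahead, so sensible actions cluster around "swerve left" and "swerve right". The function names, the mode locations (±0.8), and the unsafe band (0.4) are all illustrative assumptions. Moment matching is only one way a unimodal fit can go wrong; a trained Gaussian policy could instead collapse onto a single mode and lose the other.

```python
import random
import statistics

def sample_bimodal_steering(n, rng):
    """Draw n steering commands clustered around two avoidance manoeuvres."""
    samples = []
    for _ in range(n):
        mode = rng.choice([-0.8, 0.8])   # swerve left or swerve right
        samples.append(rng.gauss(mode, 0.05))
    return samples

def unsafe_fraction(actions, band=0.4):
    """Fraction of actions that steer (almost) straight at the pedestrian."""
    return sum(abs(a) < band for a in actions) / len(actions)

rng = random.Random(0)
actions = sample_bimodal_steering(2000, rng)

# Moment-matched single Gaussian: one mean and one standard deviation
# have to summarise both avoidance manoeuvres at once.
mu = statistics.fmean(actions)
sigma = statistics.pstdev(actions)
gaussian_actions = [rng.gauss(mu, sigma) for _ in range(2000)]

unsafe_data = unsafe_fraction(actions)
unsafe_gauss = unsafe_fraction(gaussian_actions)
print(f"mean={mu:+.3f} std={sigma:.3f}")
print(f"unsafe fraction: bimodal={unsafe_data:.3f} gaussian={unsafe_gauss:.3f}")
```

The fitted mean lands between the two manoeuvres, so a noticeable share of the Gaussian samples point at the pedestrian even though none of the original bimodal samples do. A distribution that can keep both modes separate, as diffusion models can, avoids this averaging effect.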

Paper Structure

This paper contains 17 sections, 1 equation, 6 figures, 4 tables, 3 algorithms.

Figures (6)

  • Figure 1: Conceptual diagram of the proposed method. In standard social navigation, the proposed method leverages the multimodality of diffusion models to generate various actions that allow the robot to avoid pedestrians. The blue dashed arrows represent examples of candidate actions generated for this situation, while the solid orange line indicates the actual action selected (upper window). In environments with static obstacles, actions generated solely by the diffusion model may result in collisions with walls. Therefore, guidance is applied to steer the selection toward alternative candidates that avoid wall collisions (lower window).
  • Figure 2: Architecture of the proposed method. $\boldsymbol{s}^r$ and $\boldsymbol{s}^n$ denote the states of the robot and of each pedestrian, respectively. $\boldsymbol{h}^n$ denotes the features of the robot and each pedestrian, and $\boldsymbol{\hat{h}}^n$ denotes the features after the GNN is applied.
  • Figure 3: Comparison of success rates as the number of pedestrians varies. The upper graph shows the results for the visible setting, while the lower graph shows the results for the invisible setting.
  • Figure 4: Robot trajectories with and without smoothing guidance. The blue lines represent the robot’s trajectory, while the light red lines represent the pedestrians' trajectories.
  • Figure 5: Comparison of navigation with and without guidance for static obstacle avoidance. Colored circles represent pedestrian trajectories, and the black rectangles represent walls.
  • ...and 1 more figure
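
The guidance idea described in the Figure 1 caption can be sketched with a toy sampler. This is a minimal illustration under stated assumptions, not the paper's method: `toy_denoise` is a hand-written stand-in for a learned denoiser, the wall is a hypothetical half-plane at `WALL_Y`, and the guidance term is a generic gradient-of-cost nudge whose exact form in COLSON is not given in this excerpt. All names and constants (`WALL_Y`, `MARGIN`, `DT`, `MODES`, `sample_action`) are invented for the example.

```python
import random

WALL_Y = 1.0   # hypothetical wall: every position with y >= WALL_Y is blocked
MARGIN = 0.2   # desired clearance from the wall
DT = 1.0       # look-ahead time used to predict the next position
MODES = [(0.5, 1.2), (0.5, -1.2)]  # two avoidance manoeuvres as (vx, vy)

def toy_denoise(action):
    """Stand-in for a learned denoiser: move halfway to the nearest mode."""
    nearest = min(
        MODES,
        key=lambda m: (m[0] - action[0]) ** 2 + (m[1] - action[1]) ** 2,
    )
    return (
        action[0] + 0.5 * (nearest[0] - action[0]),
        action[1] + 0.5 * (nearest[1] - action[1]),
    )

def wall_cost_grad(action, pos):
    """Gradient of max(0, MARGIN - clearance)^2 with respect to the action."""
    y_next = pos[1] + action[1] * DT
    violation = max(0.0, MARGIN - (WALL_Y - y_next))
    return (0.0, 2.0 * violation * DT)

def sample_action(pos, rng, guidance_scale, steps=20):
    """Toy reverse process: denoise, apply guidance, add annealed noise."""
    action = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    for k in range(steps):
        action = toy_denoise(action)
        gx, gy = wall_cost_grad(action, pos)
        action = (action[0] - guidance_scale * gx,
                  action[1] - guidance_scale * gy)
        noise = 0.1 * (1.0 - (k + 1) / steps)  # reaches zero at the last step
        action = (action[0] + rng.gauss(0.0, noise),
                  action[1] + rng.gauss(0.0, noise))
    return action

def next_y(action, pos):
    """Predicted y position after executing the action for DT."""
    return pos[1] + action[1] * DT

pos = (0.0, 0.0)
unguided = [sample_action(pos, random.Random(i), 0.0) for i in range(200)]
guided = [sample_action(pos, random.Random(i), 0.5) for i in range(200)]
hits_unguided = sum(next_y(a, pos) >= WALL_Y for a in unguided)
hits_guided = sum(next_y(a, pos) >= WALL_Y for a in guided)
print(f"wall hits without guidance: {hits_unguided}, with guidance: {hits_guided}")
```

Without guidance, samples that settle on the upward manoeuvre end up inside the wall, because the toy denoiser knows nothing about it. With the cost gradient applied at every step, the same samples are held back at the clearance margin. In this simplified setup the guidance trims the wall-bound manoeuvre rather than switching to the other mode; the key point is only that the correction happens at sampling time, with no change to the denoiser itself.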