Realistic pedestrian-driver interaction modelling using multi-agent RL with human perceptual-motor constraints

Yueyang Wang, Mehmet Dogar, Gustav Markkula

TL;DR

This paper models realistic pedestrian–driver interactions at unsignalised crossings with a two-agent multi-agent reinforcement learning (MARL) framework that embeds human perceptual and motor constraints, including gaze-dependent acuity and ballistic motor control. The framework combines gaze-aware visual processing, Bayesian perception, and motion-cost penalties, and it captures inter-individual variability through population-level distributions of non-policy parameters. It is evaluated on real-world one-to-one crossing data. Four variants are compared: no constraints (NC), motor constraints only (MC), visual constraints only (VC), and both visual and motor constraints (VMC). The VMC model fits real trajectories best, with the lowest composite $NLL$ and the closest trajectory reproductions, and it outperforms a behavioural cloning baseline in a data-limited setting. The work demonstrates that integrating both perceptual uncertainty and motor execution constraints improves realism in interactive road-user modelling, with implications for safer autonomous-vehicle planning and evaluation.

Abstract

Modelling pedestrian-driver interactions is critical for understanding human road user behaviour and developing safe autonomous vehicle systems. Existing approaches often rely on rule-based logic, game-theoretic models, or 'black-box' machine learning methods. However, these models typically lack flexibility or overlook the underlying mechanisms, such as sensory and motor constraints, which shape how pedestrians and drivers perceive and act in interactive scenarios. In this study, we propose a multi-agent reinforcement learning (RL) framework that integrates both visual and motor constraints of pedestrian and driver agents. Using a real-world dataset from an unsignalised pedestrian crossing, we evaluate four model variants, one without constraints, two with either motor or visual constraints, and one with both, across behavioural metrics of interaction realism. Results show that the combined model with both visual and motor constraints performs best. Motor constraints lead to smoother movements that resemble human speed adjustments during crossing interactions. The addition of visual constraints introduces perceptual uncertainty and field-of-view limitations, leading the agents to exhibit more cautious and variable behaviour, such as less abrupt deceleration. In this data-limited setting, our model outperforms a supervised behavioural cloning model, demonstrating that our approach can be effective without large training datasets. Finally, our framework accounts for individual differences by modelling parameters controlling the human constraints as population-level distributions, a perspective that has not been explored in previous work on pedestrian-vehicle interaction modelling. Overall, our work demonstrates that multi-agent RL with human constraints is a promising modelling approach for simulating realistic road user interactions.

Paper Structure

This paper contains 34 sections, 8 equations, 17 figures, 2 tables.

Figures (17)

  • Figure 1: Real-world pedestrian (yellow) and vehicle (blue) trajectories overlaid on an image of the road layout at the study site. Solid lines represent trajectory segments within the first 6 seconds of the interaction, while dashed lines indicate segments beyond this window. The ‘o’ markers denote the starting points and the ‘×’ markers denote the end points of the 6-second segments. The red dot marks the centre of the zebra crossing, referred to in the text as the crossing point. A zoomed-out view of the full scene is shown in the bottom right for context. The compass rose at the top left indicates the cardinal directions.
  • Figure 2: Overview of the two-stage sampling and conditioning scheme for non-policy parameters. Population-level means and standard deviations $(\mu, \sigma)$ are defined for $(\nu_\mathrm{ped}, w_\mathrm{ped}, \nu_\mathrm{veh}, w_\mathrm{veh})$. At the start of each episode, agent-specific values are sampled from these distributions. Each RL policy ($\pi_\mathrm{ped}$, $\pi_\mathrm{veh}$) is conditioned on its own parameters and on the population-level $(\mu, \sigma)$ of the other agent. Each observation $o$ also includes this information for conditioning.
  • Figure 3: Comparison of model variants. Arrows follow the reinforcement learning loop: $s$ is the true environment state, $a$ is the agent action (including gaze direction in the VC and VMC models), $r$ is the received reward, and $o$ is the observation fed to the RL policy. In the MC and VMC models, the action $a$ contributes to the reward only for the pedestrian agent (walking effort penalty), as indicated by the 'Ped only' label.
  • Figure 4: Reward plot.
  • Figure 5: Training loss curves for the BC model.
  • ...and 12 more figures
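
The sampling and conditioning scheme described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Gaussian form of the population distributions, the numeric values, and all names (`PopulationParam`, `sample_episode_params`, `conditioning_vector`) are assumptions made for illustration; the keys `nu` and `w` simply mirror the symbols $\nu$ and $w$ in the caption. The sketch covers only the per-episode draw and the conditioning inputs, not how the population-level $(\mu, \sigma)$ are themselves chosen.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PopulationParam:
    """Population-level mean and standard deviation of one non-policy parameter."""
    mu: float
    sigma: float

    def sample(self, rng: random.Random) -> float:
        # Assumption: a Gaussian population distribution. The caption only
        # states that (mu, sigma) are defined, not the distribution family.
        return rng.gauss(self.mu, self.sigma)


# Placeholder numbers for illustration only; they are not taken from the paper.
POPULATION = {
    "ped": {"nu": PopulationParam(0.5, 0.1), "w": PopulationParam(1.0, 0.2)},
    "veh": {"nu": PopulationParam(0.3, 0.05), "w": PopulationParam(2.0, 0.4)},
}


def sample_episode_params(population, rng):
    """Draw one agent-specific value per parameter at the start of an episode."""
    return {
        agent: {name: p.sample(rng) for name, p in params.items()}
        for agent, params in population.items()
    }


def conditioning_vector(agent, other, episode_params, population):
    """Own sampled values, followed by the other agent's population (mu, sigma)."""
    own = [episode_params[agent][name] for name in sorted(episode_params[agent])]
    other_stats = []
    for name in sorted(population[other]):
        p = population[other][name]
        other_stats += [p.mu, p.sigma]
    return own + other_stats


rng = random.Random(0)  # fixed seed so the sketch is reproducible
episode = sample_episode_params(POPULATION, rng)
ped_cond = conditioning_vector("ped", "veh", episode, POPULATION)
veh_cond = conditioning_vector("veh", "ped", episode, POPULATION)
```

In this sketch, `ped_cond` would be appended to the pedestrian policy's observation $o$ and `veh_cond` to the vehicle policy's. Each agent sees its own sampled values exactly but only the population statistics of the other agent, which matches the asymmetry stated in the caption.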