Learning-based social coordination to improve safety and robustness of cooperative autonomous vehicles in mixed traffic

Rodolfo Valiente, Behrad Toghi, Mahdi Razzaghpour, Ramtin Pedarsani, Yaser P. Fallah

TL;DR

The paper tackles safety and robustness of cooperative autonomous vehicles in mixed traffic by framing the problem as decentralized multi-agent reinforcement learning with altruistic rewards. It introduces a social utility-based framework that distinguishes sympathy toward HVs from cooperation among AVs, implemented via a 3D-CNN architecture and a safety prioritizer to ensure safety during learning and deployment. Key contributions include a POSG formulation, a decentralized social reward structure, domain adaptation and transfer learning analyses, and empirical demonstrations that altruistic AVs can learn to influence HV behavior to improve overall traffic safety and efficiency. The findings suggest that social coordination among AVs, under diverse HV behaviors and scenarios, yields more robust and societally beneficial outcomes than egoistic driving, informing future development of socially aware autonomous systems.

Abstract

It is expected that autonomous vehicles (AVs) and heterogeneous human-driven vehicles (HVs) will coexist on the same road. The safety and reliability of AVs will depend on their social awareness and their ability to engage in complex social interactions in a socially accepted manner. However, AVs are still inefficient at cooperating with HVs and struggle to understand and adapt to human behavior, which is particularly challenging in mixed autonomy. On a road shared by AVs and HVs, the social preferences and individual traits of HVs are unknown to the AVs. Unlike AVs, which are expected to follow a policy, HVs are particularly difficult to forecast because they do not necessarily follow a stationary policy. To address these challenges, we frame the mixed-autonomy problem as a multi-agent reinforcement learning (MARL) problem and propose an approach that allows AVs to learn the decision-making of HVs implicitly from experience, account for all vehicles' interests, and safely adapt to other traffic situations. In contrast with existing works, we quantify AVs' social preferences and propose a distributed reward structure that introduces altruism into their decision-making process, allowing the altruistic AVs to learn to establish coalitions and influence the behavior of HVs.

Paper Structure

This paper contains 31 sections, 22 equations, 16 figures, 6 tables, and 3 algorithms.

Figures (16)

  • Figure 1: (a) Interaction of AV-HV to benefit an HV: Altruistic agents create alliances and direct the behavior of HVs to improve traffic flow and prevent dangerous circumstances. AV1 and AV2 can create a formation to guide HV2 and provide a route for HV1, allowing the HV to change lanes and navigate to the exit ramp. (b) Interaction of AV-AV to benefit an HV: The goal of HV1 is to merge onto the highway. Egoistic AVs disregard the merging vehicle and do not make room for it, possibly resulting in dangerous situations. However, if they exhibit sympathy for the merging HV, they can compromise on their own interest to create a safe path for HV1 to merge onto the highway. (c) Interaction of AV-AV to benefit another AV: The goal of AV1 is to exit the highway. If AV2 acts selfishly, AV1 may miss the exit and be unable to complete its task. However, if AV2 and AV3 consider AV1's mission and act altruistically, they can free up space in the platoon by AV2 decelerating and AV3 accelerating, allowing AV1 to safely take the exit.
  • Figure 2: For seamless and safe highway merging, all AVs must coordinate and account for the utility of HVs. (top) Egoistic AVs optimize only for their own utility; (bottom) altruistic AVs also consider the HV's utility.
  • Figure 3: Highway exiting and merging scenarios with AVs (green) and aggressive HVs (red) sharing the road. Altruistic AVs must learn to cooperate to exit/merge successfully and safely while being adaptable to a variety of scenarios and HV behaviors.
  • Figure 4: An overview of our approach to leverage social awareness and coordination to improve the safety and reliability of CAVs. Our socially aware AVs learn from scratch not only to drive but also to understand the behavior of HVs and coordinate with them. They learn to adapt to and influence HVs in a robust and safe manner.
  • Figure 5: The SVO angle $\phi$ quantifies the level of altruism of an agent. In the figure, the diameter of each circle represents the size of the human population that holds the associated SVO [garapin2015does].
  • ...and 11 more figures
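
To make the role of the SVO angle in the Figure 5 caption concrete, the following is a minimal sketch of the angular blend commonly used in the social value orientation literature, where an agent's utility weights its own reward by cos(φ) and the reward of others by sin(φ). This is an illustrative assumption, not necessarily the exact reward defined in the paper; the function name `svo_utility` and the sample reward values are hypothetical.

```python
import math


def svo_utility(ego_reward: float, others_reward: float, phi: float) -> float:
    """Blend an agent's own reward with the reward of others.

    phi is the SVO angle in radians (hypothetical parameterization):
    phi = 0 is purely egoistic, phi = pi/2 is purely altruistic.
    """
    return math.cos(phi) * ego_reward + math.sin(phi) * others_reward


if __name__ == "__main__":
    # Hypothetical rewards: the ego vehicle gains 1.0, surrounding vehicles gain 0.5.
    for label, phi in [
        ("egoistic", 0.0),
        ("prosocial", math.pi / 4),
        ("altruistic", math.pi / 2),
    ]:
        print(f"{label}: {svo_utility(1.0, 0.5, phi):.3f}")
```

Sweeping φ from 0 toward π/2 shifts weight from the agent's own reward to the rewards of surrounding vehicles, which is the sense in which the angle quantifies altruism.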