Social Coordination and Altruism in Autonomous Driving

Behrad Toghi, Rodolfo Valiente, Dorsa Sadigh, Ramtin Pedarsani, Yaser P. Fallah

TL;DR

The paper tackles safety and efficiency in mixed-autonomy traffic by modeling AV-HV decision-making as a partially observable stochastic game and training altruistic autonomous agents via deep multi-agent reinforcement learning. It introduces a decentralized reward framework that splits altruism into sympathy (toward HVs) and cooperation (among AVs) and optimizes a social value orientation angle to balance self and group utilities. Through a highway merging case study, the authors show that altruistic AVs can form alliances, influence HV behavior, and substantially improve merging success, traffic flow, and safety, while a purely egoistic approach underperforms. The work also proposes a semi-sequential training paradigm to mitigate non-stationarity and demonstrates robustness to different human-driver models, highlighting the practical potential and limitations of deploying altruistic coordination in real-world traffic.
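
The summary above does not state the reward formula, so the following is only a minimal sketch of how an SVO angle is commonly used to blend an agent's own reward with the rewards of others, with the social part split into a cooperation term (other AVs) and a sympathy term (HVs). The function name `svo_reward`, its parameters, and the unweighted sums are illustrative assumptions, not the authors' implementation.

```python
import math


def svo_reward(ego_reward, av_rewards, hv_rewards, phi):
    """Blend egoistic and social rewards with an SVO angle phi (radians).

    Hypothetical helper: phi = 0 gives a purely egoistic agent,
    phi = pi/2 gives a purely altruistic one.
    """
    cooperation = sum(av_rewards)  # assumed: plain sum over other AVs' rewards
    sympathy = sum(hv_rewards)     # assumed: plain sum over HVs' rewards
    social = cooperation + sympathy
    return math.cos(phi) * ego_reward + math.sin(phi) * social


# Example: sweep the angle from egoistic to altruistic.
for phi in (0.0, math.pi / 4, math.pi / 2):
    print(round(svo_reward(1.0, [2.0], [3.0], phi), 4))
```

Under these assumptions, moving `phi` from 0 toward pi/2 shifts weight from the agent's own reward to the group's, which is the trade-off the SVO angle is meant to tune.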

Abstract

Despite the advances in the autonomous driving domain, autonomous vehicles (AVs) are still inefficient and limited in terms of cooperating with each other or coordinating with vehicles operated by humans. A group of autonomous and human-driven vehicles (HVs) which work together to optimize an altruistic social utility -- as opposed to the egoistic individual utility -- can co-exist seamlessly and assure safety and efficiency on the road. Achieving this mission without explicit coordination among agents is challenging, mainly due to the difficulty of predicting the behavior of humans with heterogeneous preferences in mixed-autonomy environments. Formally, we model an AV's maneuver planning in mixed-autonomy traffic as a partially-observable stochastic game and attempt to derive optimal policies that lead to socially-desirable outcomes using a multi-agent reinforcement learning framework. We introduce a quantitative representation of the AVs' social preferences and design a distributed reward structure that induces altruism into their decision making process. Our altruistic AVs are able to form alliances, guide the traffic, and affect the behavior of the HVs to handle competitive driving scenarios. As a case study, we compare egoistic AVs to our altruistic autonomous agents in a highway merging setting and demonstrate the emerging behaviors that lead to a noticeable improvement in the number of successful merges as well as the overall traffic flow and safety.

Paper Structure

This paper contains 22 sections, 21 equations, 11 figures, 3 tables, 1 algorithm.

Figures (11)

  • Figure 1: (a) AV-HV interaction to benefit another HV: Altruistic agents have the opportunity to form alliances and guide the behavior of HVs in order to improve the traffic flow and avoid hazardous situations. AV1 & AV2 can build a formation to slow down HV2 and open up a pathway for HV1, enabling it to trust the AVs, change lanes, and navigate towards the exit ramp. (b) AV-AV interaction to benefit another HV: HV1 intends to merge into the highway. Egoistic AVs ignore the merging vehicle and do not open up space for it, which can lead to hazardous scenarios, whereas if they show sympathy for the merging HV, they can compromise on their own interest in order to create a safe path for HV1 to merge into the highway. (c) AV-AV interaction to benefit another AV: AV1 attempts to exit the highway. If AV2-AV5 act egoistically, AV1 might miss the exit and be unable to follow its planned mission. However, if AV2-AV5 take into account the interest of AV1 and act altruistically, they can open up space in the platoon, with AV2 & AV3 decelerating and AV4 & AV5 accelerating, to enable a safe exit for AV1.
  • Figure 2: SVO angular phase $\phi$ quantifies an agent's level of altruism. The figure is based on empirical data collected from humans by Garapin et al. [garapin2015does]. The diameter of each circle shows the size of the human population that holds the corresponding SVO.
  • Figure 3: Case study: a mission vehicle, which can be human-driven or autonomous, attempts to merge into the mixed group of AVs and HVs.
  • Figure 4: Multi-agent training and policy dissemination process.
  • Figure 5: Impact of the sympathy and cooperation elements on traffic safety and on the success of the merging maneuver, for both $M\in\mathcal{I}$ and $M\in\mathcal{V}$. Hatched bars show the number of independent crashes that do not involve the mission vehicle.
  • ...and 6 more figures