OPTIMA: Optimized Policy for Intelligent Multi-Agent Systems Enables Coordination-Aware Autonomous Vehicles

Rui Du; Kai Zhao; Jinlong Hou; Qiang Zhang; Peter Zhang

OPTIMA: Optimized Policy for Intelligent Multi-Agent Systems Enables Coordination-Aware Autonomous Vehicles

Rui Du, Kai Zhao, Jinlong Hou, Qiang Zhang, Peter Zhang

TL;DR

This work introduces OPTIMA, a novel distributed reinforcement learning framework for cooperative autonomous vehicle tasks that alternates between thorough data sampling from environmental interactions and multi-agent reinforcement learning algorithms to optimize CAV cooperation, emphasizing both safety and efficiency.

Abstract

Coordination among connected and autonomous vehicles (CAVs) is advancing due to developments in control and communication technologies. However, much of the current work is based on oversimplified and unrealistic task-specific assumptions, which may introduce vulnerabilities. This is critical because CAVs not only interact with their environment but are also integral parts of it. Insufficient exploration can result in policies that carry latent risks, highlighting the need for methods that explore the environment both extensively and efficiently. This work introduces OPTIMA, a novel distributed reinforcement learning framework for cooperative autonomous vehicle tasks. OPTIMA alternates between thorough data sampling from environmental interactions and multi-agent reinforcement learning algorithms to optimize CAV cooperation, emphasizing both safety and efficiency. Our goal is to improve the generality and performance of CAVs in highly complex and crowded scenarios. Furthermore, the industrial-scale distributed training system easily adapts to different algorithms, reward functions, and strategies.

OPTIMA: Optimized Policy for Intelligent Multi-Agent Systems Enables Coordination-Aware Autonomous Vehicles

TL;DR

Abstract

OPTIMA: Optimized Policy for Intelligent Multi-Agent Systems Enables Coordination-Aware Autonomous Vehicles

TL;DR

Abstract

Paper Structure

Table of Contents

Figures (3)