QuadBEV: An Efficient Quadruple-Task Perception Framework via Bird's-Eye-View Representation

Yuxin Li, Yiheng Li, Xulei Yang, Mengying Yu, Zihang Huang, Xiaojun Wu, Chai Kiat Yeo

TL;DR

This work proposes QuadBEV, an efficient multitask perception framework that shares spatial and contextual information across four key tasks (3D object detection, lane detection, map segmentation, and occupancy prediction), reducing redundant computation and improving system efficiency.

Abstract

Bird's-Eye-View (BEV) perception has become a vital component of autonomous driving systems due to its ability to integrate multiple sensor inputs into a unified representation, enhancing performance in various downstream tasks. However, the computational demands of BEV models pose challenges for real-world deployment in vehicles with limited resources. To address these limitations, we propose QuadBEV, an efficient multitask perception framework that leverages the shared spatial and contextual information across four key tasks: 3D object detection, lane detection, map segmentation, and occupancy prediction. QuadBEV not only streamlines the integration of these tasks using a shared backbone and task-specific heads but also addresses common multitask learning challenges such as learning rate sensitivity and conflicting task objectives. Our framework reduces redundant computations, thereby enhancing system efficiency, making it particularly suited for embedded systems. We present comprehensive experiments that validate the effectiveness and robustness of QuadBEV, demonstrating its suitability for real-world applications.

Paper Structure

This paper contains 15 sections, 3 equations, 3 figures, and 6 tables.

Figures (3)

  • Figure 1: Overall Architecture. The components fall into two groups: shared feature extractors and task-specific heads. The shared feature extractors comprise five modules: backbone, depth estimator, view projector, temporal fusor, and BEV encoder. The task-specific heads cover 3D object detection, map segmentation, lane detection, and occupancy prediction.
  • Figure 2: Architecture of Quadruple Head on Shared BEV Feature. Four independent heads are attached to the BEV feature map in a round-robin manner.
  • Figure 3: Comparison of Different Learning Rate and Weight Schedules
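
The two-group layout described in the Figure 1 caption can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: the class names `SharedExtractor` and `QuadHeadModel`, the list-of-floats feature type, and the placeholder stage and head functions are all hypothetical. It only shows the structural point that the shared extractor runs once per input while four heads consume the same BEV feature.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a real feature tensor.
Feature = List[float]


def toy_stage(x: Feature) -> Feature:
    """Placeholder for one shared module; real modules are neural networks."""
    return [v + 1.0 for v in x]


@dataclass
class SharedExtractor:
    """Chains the shared modules and counts how often the chain is run."""
    stages: List[Callable[[Feature], Feature]]
    calls: int = 0

    def __call__(self, x: Feature) -> Feature:
        self.calls += 1
        for stage in self.stages:
            x = stage(x)
        return x


@dataclass
class QuadHeadModel:
    """One shared extractor feeding several task-specific heads."""
    extractor: SharedExtractor
    heads: Dict[str, Callable[[Feature], float]]

    def __call__(self, x: Feature) -> Dict[str, float]:
        bev = self.extractor(x)  # shared BEV feature, computed once
        return {name: head(bev) for name, head in self.heads.items()}


# Five shared modules, in the order listed in the Figure 1 caption:
# backbone, depth estimator, view projector, temporal fusor, BEV encoder.
extractor = SharedExtractor(stages=[toy_stage] * 5)

# Four toy heads (assumed reductions, chosen only to give distinct outputs).
model = QuadHeadModel(
    extractor=extractor,
    heads={
        "detection_3d": sum,
        "map_segmentation": max,
        "lane_detection": min,
        "occupancy": lambda f: sum(f) / len(f),
    },
)

outputs = model([0.0, 1.0, 2.0])
print(outputs)
print("shared extractor calls:", model.extractor.calls)
```

In this toy setup, `model.extractor.calls` is 1 after a forward pass that produces all four outputs, whereas four separate single-task models would each run their own extractor on the same input.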