A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective

Huaiyuan Xu, Junliang Chen, Shiyu Meng, Yi Wang, Lap-Pui Chau

TL;DR

This survey reviews the most recent works on 3D occupancy perception, provides in-depth analyses of methodologies with various input modalities, and evaluates the occupancy perception performance of state-of-the-art methods on the most popular datasets.

Abstract

3D occupancy perception technology aims to observe and understand dense 3D environments for autonomous vehicles. Owing to its comprehensive perception capability, this technology is emerging as a trend in autonomous driving perception systems and is attracting significant attention from both industry and academia. Like traditional bird's-eye view (BEV) perception, 3D occupancy perception takes multi-source input and therefore requires information fusion. Unlike 2D BEV, however, it captures the vertical structures that BEV ignores. In this survey, we review the most recent works on 3D occupancy perception and provide in-depth analyses of methodologies with various input modalities. Specifically, we summarize general network pipelines, highlight information fusion techniques, and discuss effective network training. We evaluate and analyze the occupancy perception performance of state-of-the-art methods on the most popular datasets. Furthermore, we discuss challenges and future research directions. We hope this paper will inspire the community and encourage more research on 3D occupancy perception. A comprehensive list of the studies in this survey is publicly available in an active repository that continuously collects the latest work: https://github.com/HuaiyuanXu/3D-Occupancy-Perception.
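
To make the contrast between 2D BEV and 3D occupancy concrete, the following is a minimal sketch rather than anything taken from the surveyed methods. It assumes a toy 4 x 4 x 3 voxel grid, hypothetical class ids (`FREE`, `ROAD`, `CAR`, `TREE`), and illustrative helpers `to_binary_occupancy` and `to_bev`. It builds a semantic voxel grid, derives the occupancy-only grid from it, and collapses the height axis to a BEV map.

```python
# Hypothetical toy scene: a 4 x 4 x 3 voxel grid indexed as grid[x][y][z].
# Class ids are illustrative assumptions, not taken from any dataset.
FREE, ROAD, CAR, TREE = 0, 1, 2, 3


def make_grid(nx, ny, nz):
    """Create an empty (all-free) voxel grid."""
    return [[[FREE for _ in range(nz)] for _ in range(ny)] for _ in range(nx)]


def to_binary_occupancy(semantic):
    """Occupancy without semantics: 1 if any class occupies the voxel, else 0."""
    return [[[int(v != FREE) for v in col] for col in row] for row in semantic]


def to_bev(semantic):
    """Collapse the height axis, keeping the topmost non-free class per (x, y)."""
    bev = []
    for row in semantic:
        bev_row = []
        for col in row:
            top = FREE
            for v in col:  # z increases upward, so later hits overwrite earlier
                if v != FREE:
                    top = v
            bev_row.append(top)
        bev.append(bev_row)
    return bev


semantic = make_grid(4, 4, 3)
for x in range(4):
    for y in range(4):
        semantic[x][y][0] = ROAD  # ground layer
semantic[1][1][1] = CAR           # a car sitting on the road
semantic[2][2][2] = TREE          # overhanging branch: z=2 occupied, z=1 free

binary = to_binary_occupancy(semantic)
bev = to_bev(semantic)

print("3D column at (2, 2):", semantic[2][2])
print("BEV cell at (2, 2):", bev[2][2])
```

In this toy scene the BEV cell at (2, 2) records only the tree, while the 3D column also records that the voxel beneath the branch is free. That is the kind of vertical structure a 2D BEV map discards.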

Paper Structure (40 sections, 18 equations, 7 figures, 6 tables)

Figures (7)

  • Figure 1: Autonomous driving vehicle system. The sensing data from cameras, LiDAR, and radar enable the vehicle to intelligently perceive its surroundings. Subsequently, the intelligent decision module generates control and planning of driving behavior. Occupancy perception surpasses other perception methods based on perspective view, bird's-eye view, or point clouds, in terms of 3D understanding and density.
  • Figure 2: Chronological overview of 3D occupancy perception. It can be observed that: (1) research on occupancy has undergone explosive growth since 2023; (2) the predominant trend focuses on vision-centric occupancy, supplemented by LiDAR-centric and multi-modal methods.
  • Figure 3: Illustration of voxel-wise representations with and without semantics. The left voxel volume depicts the overall occupancy distribution. The right voxel volume incorporates semantic enrichment, where each voxel is associated with a class estimation.
  • Figure 4: Architecture for LiDAR-centric occupancy perception: solely the 2D branch [rist2021semantic, zuo2023pointocc], solely the 3D branch [openoccupancy, roldao2020lmscnet, min2023occupancy], and integrating both 2D and 3D branches [cheng2021s3cnet].
  • Figure 5: Architecture for vision-centric occupancy perception: methods without temporal fusion [gan2023simple, huang2023tri, occ3d, zhang2023occnerf, surroundocc, xu2024regulating, hou2024fastocc, pan2023renderocc, huang2023selfocc, boeder2024occflownet]; methods with temporal fusion [wang2023panoocc, openocc, cam4docc, silva2024s2tpvformer, yu2023flashocc, ma2023cotr].
  • ...and 2 more figures