Systematic Literature Review on Vehicular Collaborative Perception -- A Computer Vision Perspective

Lei Wan, Jianxin Zhao, Andreas Wiedholz, Manuel Bied, Mateus Martinez de Lucena, Abhishek Dinkar Jagtap, Andreas Festag, Antônio Augusto Fröhlich, Hannan Ejaz Keen, Alexey Vinel

TL;DR

This systematic review addresses the problem of occlusion and limited sensing in single‑vehicle perception by examining vehicular collaborative perception (CP) from a computer‑vision perspective. It follows the PRISMA 2020 guidelines to analyze 106 peer-reviewed studies and proposes a structured taxonomy across modality, collaboration type, and perception task, highlighting the predominance of LiDAR, intermediate fusion, and object detection. The study critically examines evaluation methodologies, datasets, and metrics, and synthesizes approaches to real‑world challenges such as pose errors, latency, bandwidth constraints, domain shifts, heterogeneity, and adversarial threats, while identifying gaps, especially in camera modalities and CP‑specific evaluation. It also outlines opportunities for hardware diversification, data/communication optimization, robust fusion strategies, and open, end‑to‑end evaluation frameworks to accelerate practical CP deployment in autonomous driving. Overall, the review provides a foundation for advancing CP research, guiding dataset creation, methodology selection, and evaluation standardization to bridge research and real‑world deployment.

Abstract

The effectiveness of autonomous vehicles relies on reliable perception capabilities. Despite significant advancements in artificial intelligence and sensor fusion technologies, current single-vehicle perception systems continue to encounter limitations, notably visual occlusions and limited long-range detection capabilities. Collaborative Perception (CP), enabled by Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication, has emerged as a promising solution to mitigate these issues and enhance the reliability of autonomous systems. Beyond advancements in communication, the computer vision community is increasingly focusing on improving vehicular perception through collaborative approaches. However, a systematic literature review that thoroughly examines existing work and reduces subjective bias is still lacking. Such a systematic approach helps identify research gaps, recognize common trends across studies, and inform future research directions. In response, this study follows the PRISMA 2020 guidelines and includes 106 peer-reviewed articles. These publications are analyzed based on modalities, collaboration schemes, and key perception tasks. Through a comparative analysis, this review illustrates how different methods address practical issues such as pose errors, temporal latency, communication constraints, domain shifts, heterogeneity, and adversarial attacks. Furthermore, it critically examines evaluation methodologies, highlighting a misalignment between current metrics and CP's fundamental objectives. By delving into all relevant topics in-depth, this review offers valuable insights into challenges, opportunities, and risks, serving as a reference for advancing research in vehicular collaborative perception.

Paper Structure

This paper contains 89 sections, 1 equation, 8 figures, 30 tables.

Figures (8)

  • Figure 1: Illustration of a road traffic scenario for CP: The green shaded areas represent the perception ranges of the ego vehicle (white) and the CAV (red). The ego vehicle cannot perceive the pedestrians on its right because a building blocks its line of sight. Additionally, another vehicle (blue) on the opposite side of the intersection lies outside the ego vehicle’s perception range, illustrating the long-range problem. However, the CAV and the infrastructure roadside unit can detect the pedestrians and the other vehicle, respectively, and share their observations with the ego vehicle, thereby enhancing its situational awareness.
  • Figure 2: Organization of this SLR.
  • Figure 3: The procedure of the SLR in three stages: planning (review protocol development), conducting (screening and selection of articles), and documenting (synthesis of findings).
  • Figure 4: Number of publications over the past five years.
  • Figure 5: Procedure of the search and selection, starting from 4876 items, reduced to 3980 after duplicates were removed, 249 after screening and snowballing, and resulting in 106 studies included in the final review.
  • ...and 3 more figures
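
The selection counts in the Figure 5 caption can be tabulated to show how much each stage narrows the pool. The following is a minimal sketch, not the authors' tooling: the four counts come from the caption, while the stage labels and the helper `funnel` are hypothetical names introduced here. Because the third stage combines screening with snowballing (which adds items), its change is a net figure rather than a pure exclusion count.

```python
# Counts are taken from the Figure 5 caption; stage labels are paraphrased.
stages = [
    ("identified", 4876),
    ("after duplicate removal", 3980),
    ("after screening and snowballing", 249),
    ("included in final review", 106),
]


def funnel(stages):
    """Return rows of (label, count, net change vs. previous stage, share of initial)."""
    initial = stages[0][1]
    rows = []
    prev = initial
    for label, count in stages:
        # The net change is negative when a stage shrinks the pool.
        rows.append((label, count, count - prev, count / initial))
        prev = count
    return rows


rows = funnel(stages)
for label, count, delta, share in rows:
    print(f"{label:<34}{count:>6}{delta:>7}{share:>8.1%}")
```

Read this way, duplicate removal accounts for 896 items, and 143 of the 249 shortlisted articles are excluded before the final set of 106.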