Bandwidth-adaptive Cloud-Assisted 360-Degree 3D Perception for Autonomous Vehicles

Faisal Hawlader; Rui Meireles; Gamal Elghazaly; Ana Aguiar; Raphaël Frank

TL;DR

This approach uses transformer-based models to fuse multi-camera sensor data into a comprehensive Bird's-Eye View (BEV) representation, enabling accurate 360-degree 3D object detection, and reduces overall latency by offloading part of the processing to the cloud over Vehicle-to-Everything (V2X) communication.

Abstract

A key challenge for autonomous driving lies in maintaining real-time situational awareness regarding surrounding obstacles under strict latency constraints. The high processing requirements coupled with limited onboard computational resources can cause delay issues, particularly in complex urban settings. To address this, we propose leveraging Vehicle-to-Everything (V2X) communication to partially offload processing to the cloud, where compute resources are abundant, thus reducing overall latency. Our approach utilizes transformer-based models to fuse multi-camera sensor data into a comprehensive Bird's-Eye View (BEV) representation, enabling accurate 360-degree 3D object detection. The computation is dynamically split between the vehicle and the cloud based on the number of layers processed locally and the quantization level of the features. To further reduce network load, we apply feature vector clipping and compression prior to transmission. In a real-world experimental evaluation, our hybrid strategy achieved a 72% reduction in end-to-end latency compared to a traditional onboard solution. To adapt to fluctuating network conditions, we introduce a dynamic optimization algorithm that selects the split point and quantization level to maximize detection accuracy while satisfying real-time latency constraints. Trace-based evaluation under realistic bandwidth variability shows that this adaptive approach improves accuracy by up to 20% over static parameterization with the same latency performance.
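The clipping, quantization, and compression steps mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the clipping range, the quantization routine, or the codec, so everything here is an assumption: `encode_features` and `decode_features` are hypothetical helper names, FP16 packing via Python's `struct` module stands in for the quantization step, and `zlib` stands in for whatever compression the authors used.

```python
import struct
import zlib
from typing import List, Sequence


def encode_features(values: Sequence[float], clip: float, level: int = 6) -> bytes:
    """Clip, quantize to FP16, and compress a flat feature vector.

    Assumption: symmetric clipping to [-clip, clip]; the paper's actual
    clipping rule is not stated in the abstract.
    """
    # Clipping bounds the dynamic range so that FP16 can represent every value.
    clipped = [max(-clip, min(clip, v)) for v in values]
    # "<e" is little-endian IEEE 754 half precision (2 bytes per value).
    raw = struct.pack(f"<{len(clipped)}e", *clipped)
    # Lossless compression as a stand-in for the paper's compression step.
    return zlib.compress(raw, level)


def decode_features(payload: bytes) -> List[float]:
    """Inverse of encode_features, up to the clipping and FP16 rounding."""
    raw = zlib.decompress(payload)
    return list(struct.unpack(f"<{len(raw) // 2}e", raw))


features = [0.5, -3.0, 100.0, 0.25]
payload = encode_features(features, clip=2.0)
restored = decode_features(payload)
```

In this toy run the two out-of-range values are clipped to -2.0 and 2.0, and the remaining values survive the FP16 round trip unchanged because they are exactly representable in half precision.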

Paper Structure (18 sections, 1 theorem, 4 equations, 11 figures, 4 tables, 5 algorithms)

Introduction
Related Work
Methodology
Test Scenarios & Routes
Hardware Configuration & Detection Model
Input Dataset and Evaluation Metric
Lightweight Features Offloading: Hybrid Computing
CPM Encoding
Experiments and Results
Onboard Computing and CPM Transmission
Hybrid Computing and Lightweight Features Sharing
Dynamic Hybrid-Computing Parameter Selection
Problem Formulation
Dynamic Optimization Algorithm
Evaluation
...and 3 more sections

Key Result

Theorem 1

$optPar()$ returns the parameter tuple $(split, q)$ that yields the highest NDS (nuScenes Detection Score) among all tuples satisfying the latency bound $lat_{total} \leq lat_{max}$ or, if no such tuple exists, the tuple that minimizes $lat_{total}$.
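The selection rule stated in Theorem 1 can be sketched as an exhaustive search over candidate tuples. This is not the paper's algorithm listing: `opt_par`, `PROFILE`, `CLOUD_MS`, and `make_latency` are hypothetical names, the latency model (local extraction time plus transmission time plus a fixed cloud time) is an assumed simplification, and every number in `PROFILE` is a placeholder rather than a measurement from the paper.

```python
from typing import Callable, Iterable, Tuple

Param = Tuple[int, str]  # (split layer, quantization level)

# Placeholder profile: (split, q) -> (NDS, local time in ms, feature size in kB).
# Illustrative values only, not results from the paper.
PROFILE = {
    (1, "FP32"): (0.50, 20.0, 4000.0),
    (1, "FP16"): (0.48, 15.0, 2000.0),
    (1, "FP8"): (0.40, 12.0, 1000.0),
}
CLOUD_MS = 30.0  # assumed constant cloud-side processing time


def make_latency(bandwidth_mbps: float) -> Callable[[Param], float]:
    """Build a total-latency estimator for the current bandwidth."""
    def lat_total(p: Param) -> float:
        _, local_ms, size_kb = PROFILE[p]
        # kB * 8 = kbit, and 1 Mbit/s = 1 kbit/ms, so the quotient is in ms.
        tx_ms = size_kb * 8.0 / bandwidth_mbps
        return local_ms + tx_ms + CLOUD_MS
    return lat_total


def opt_par(candidates: Iterable[Param],
            nds: Callable[[Param], float],
            lat_total: Callable[[Param], float],
            lat_max: float) -> Param:
    """Highest-NDS tuple within the latency bound, else the lowest-latency tuple."""
    pool = list(candidates)
    if not pool:
        raise ValueError("no candidate parameter tuples")
    feasible = [p for p in pool if lat_total(p) <= lat_max]
    if feasible:
        return max(feasible, key=nds)
    return min(pool, key=lat_total)


def nds_of(p: Param) -> float:
    return PROFILE[p][0]


high = opt_par(PROFILE, nds_of, make_latency(1000.0), lat_max=100.0)
medium = opt_par(PROFILE, nds_of, make_latency(400.0), lat_max=100.0)
low = opt_par(PROFILE, nds_of, make_latency(100.0), lat_max=100.0)
```

With these placeholder numbers, the search picks FP32 at 1000 Mbit/s, drops to FP16 at 400 Mbit/s, and falls back to the lowest-latency FP8 tuple at 100 Mbit/s, where no tuple meets the 100 ms bound.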

Figures (11)

Figure 1: In the Onboard Computing scenario, the BEVFormer model runs locally, transmitting detection results as CPMs over ITS-G5. In the Hybrid Computing scenario, a compressed feature vector is sent via C-V2X to the cloud for intensive processing, with detection results broadcast to nearby vehicles.
Figure 2: A basic overview of different containers included in the CPM message format as defined by the ETSI standard ts2023103.
Figure 3: CPM transmission latency versus distance between a moving vehicle (25 km/h avg.) and a stationary receiver at fixed coordinates (longitude: 6.161993, latitude: 49.626478). The dashed line shows the average latency of 4.10 ms.
Figure 4: Feature vector size and extraction time versus split depth. Solid lines represent feature extraction time (left y-axis), while dashed lines indicate feature size (right y-axis). Lower-precision quantization (e.g., FP16, FP8) reduces both extraction time and feature size.
Figure 5: Transmission latency of feature vectors from vehicle to cloud across five split layers for FP32, FP16, and FP8 over a 5G network using C-V2X. FP8 demonstrates the lowest and most stable latency, suitable for real-time transmission. FP32 exhibits the highest latency and variability, especially at earlier split layers due to larger feature size.
...and 6 more figures
