COOPERTRIM: Adaptive Data Selection for Uncertainty-Aware Cooperative Perception

Shilpa Mukhopadhyay, Amit Roy-Chowdhury, Hang Qiu

TL;DR

Qualitative results show COOPERTRIM gracefully adapts to environmental dynamics, localization error, and communication latency, demonstrating flexibility and paving the way for real-world deployment.

Abstract

Cooperative perception enables autonomous agents to share encoded representations over wireless communication to enhance each other's live situational awareness. However, the tension between the limited communication bandwidth and the rich sensor information hinders its practical deployment. Recent studies have explored selection strategies that share only a subset of features per frame while striving to keep the performance on par. Nevertheless, the bandwidth requirement still stresses current wireless technologies. To fundamentally ease the tension, we take a proactive approach, exploiting the temporal continuity to identify features that capture environment dynamics, while avoiding repetitive and redundant transmission of static information. By incorporating temporal awareness, agents are empowered to dynamically adapt the sharing quantity according to environment complexity. We instantiate this intuition into an adaptive selection framework, COOPERTRIM, which introduces a novel conformal temporal uncertainty metric to gauge feature relevance, and a data-driven mechanism to dynamically determine the sharing quantity. To evaluate COOPERTRIM, we take semantic segmentation and 3D detection as example tasks. Across multiple open-source cooperative segmentation and detection models, COOPERTRIM achieves up to 80.28% and 72.52% bandwidth reduction respectively while maintaining a comparable accuracy. Relative to other selection strategies, COOPERTRIM also improves IoU by as much as 45.54% with up to 72% less bandwidth. Combined with compression strategies, COOPERTRIM can further reduce bandwidth usage to as low as 1.46% without compromising IoU performance. Qualitative results show COOPERTRIM gracefully adapts to environmental dynamics, localization error, and communication latency, demonstrating flexibility and paving the way for real-world deployment.

Paper Structure

This paper contains 20 sections, 1 theorem, 14 equations, 11 figures, 5 tables.

Key Result

Theorem 1

An $\epsilon$-greedy training strategy that computes the gradient of the loss $\text{L}$ using full data ($D_{\text{full}}$) with probability $\epsilon$ and partial data ($D_{\text{partial}}$) with probability $(1 - \epsilon)$ reduces the bias of the gradient estimator compared to using only partial data ...
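The mixing rule in Theorem 1 can be illustrated with a minimal sketch. It assumes that "bias" is measured against the full-data gradient, which the truncated statement above does not confirm. The names `epsilon_greedy_gradient` and `expected_bias` are hypothetical and are not taken from the paper. Under that assumption, linearity of expectation gives a bias of $(1 - \epsilon)(g_{\text{partial}} - g_{\text{full}})$, smaller in magnitude than the partial-only bias whenever $\epsilon > 0$.

```python
import random


def epsilon_greedy_gradient(grad_full, grad_partial, epsilon, rng=random):
    """Pick the gradient source for one training step (hypothetical helper).

    grad_full / grad_partial are zero-argument callables returning the
    gradient computed on D_full / D_partial respectively.
    """
    # rng.random() is uniform on [0, 1), so this branch fires with
    # probability epsilon.
    if rng.random() < epsilon:
        return grad_full()
    return grad_partial()


def expected_bias(g_full, g_partial, epsilon):
    """Bias of the mixed estimator, ASSUMING the full-data gradient is the reference."""
    expected = epsilon * g_full + (1.0 - epsilon) * g_partial
    return expected - g_full


if __name__ == "__main__":
    # Toy scalar gradients, chosen only for illustration.
    g_full, g_partial = 1.0, 0.5
    print("partial-only bias:", g_partial - g_full)
    print("mixed bias (eps=0.2):", expected_bias(g_full, g_partial, 0.2))
```

In a real training loop the two callables would wrap forward and backward passes over the full and the selected feature sets; the scalars here only make the arithmetic visible.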

Figures (11)

  • Figure 1: (a) CooperTrim Overview. CooperTrim conducts feature learning, followed by an uncertainty-based selection module using learned features. It estimates adaptive temporal uncertainty (via learned confidence) for each feature, performs cross-attention-based feature weighting, and selects features using a learned threshold. The ego then broadcasts a request vector for selected features, reconstructs received CAV data into full features, fuses them, and sends them to the task head for final results. (b) Cross-Attention Module uses learned projections of temporal uncertainty as queries, and feature projections as keys and values. These matrices pass through an attention module, and a learned threshold at the final output generates a binary mask for selected channels.
  • Figure 2: "Trimming" existing cooperative perception baselines, CooperTrim reduces bandwidth significantly while preserving accuracy in segmentation (across different semantics, i.e., dynamic, static road, and static lane) and 3D detection tasks.
  • Figure 3: Across-the-board Bandwidth Comparison. C: Compression. FS: Feature Selection. AS: Agent Selection. CooperTrim consumes less bandwidth than every baseline.
  • Figure 4: Comparison of IoU performance and bandwidth usage at compression rates (1x, 8x, 32x) for CooperTrim, CoBEVT, and AttFuse in Dynamic, Road, and Lane scenarios.
  • Figure 5: Increased data requests align with higher scene complexity. For dynamic objects, complexity in number and positioning rises in Frames 1200, 200, and 1700. For static elements, complexity grows in Frames 900, 250, and 1600, with more intersections and lane orientations. Visualizations on the right highlight the frames at the vertical lines: Frames 1200, 200, and 1700 (dynamic) and Frames 900, 250, and 1600 (static). Green dashed lines show baseline CoBEVT IoU. Green solid lines show CooperTrim IoU.
  • ...and 6 more figures
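
The cross-attention selection described in the Figure 1(b) caption can be sketched roughly as follows. This is an illustrative reading of the caption, not the authors' implementation: the tensor shapes, the per-channel descriptors, the sigmoid on the output, the random weights standing in for learned projections, and the name `select_channels` are all assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def select_channels(uncertainty, features, w_q, w_k, w_v, threshold):
    """Return a binary channel mask and the per-channel scores.

    uncertainty: (C, d_u) temporal-uncertainty descriptor per channel (assumed shape)
    features:    (C, d)   feature descriptor per channel (assumed shape)
    """
    q = uncertainty @ w_q  # queries from uncertainty, (C, d_k)
    k = features @ w_k     # keys from features, (C, d_k)
    v = features @ w_v     # values from features, (C, 1)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (C, C)
    # Squash attended values to (0, 1) so a scalar threshold is meaningful.
    scores = 1.0 / (1.0 + np.exp(-(attn @ v).squeeze(-1)))   # (C,)
    mask = scores > threshold  # True marks a channel to request
    return mask, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C, d_u, d, d_k = 8, 4, 16, 4
    mask, scores = select_channels(
        rng.normal(size=(C, d_u)), rng.normal(size=(C, d)),
        rng.normal(size=(d_u, d_k)), rng.normal(size=(d, d_k)),
        rng.normal(size=(d, 1)), threshold=0.5,
    )
    # Fraction of channels requested, a rough proxy for bandwidth use.
    print("requested fraction:", mask.mean())
```

In this sketch the threshold is a fixed scalar; the caption describes it as learned, so in training it would be a parameter updated alongside the projections.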

Theorems & Definitions (2)

  • Theorem 1: Effectiveness of $\epsilon$-Greedy Training
  • proof