Unveiling the Black Box: Independent Functional Module Evaluation for Bird's-Eye-View Perception Model
Ludan Zhang, Xiaokang Ding, Yuqi Dai, Lei He, Keqiang Li
TL;DR
BEV-IFME addresses the opacity of end-to-end BEV perception models with an Independent Functional Module Evaluation framework that maps Ground Truth (GT) and module feature maps into a shared semantic Representation Space (Re-Space) and measures their similarity. A two-stage Alignment AutoEncoder, guided by GT encodings from pre-trained LLMs, produces feature representations whose cosine similarity to the GT representations gives a Similarity Score. This score correlates strongly and positively with BEV metrics such as mAP and NDS (average correlation coefficient 0.9387). The approach enables independent evaluation and hierarchical optimization of functional modules, shows stable behavior across configurations, and can guide training adjustments. Validation on NuScenes-mini across eight module configurations confirms that Similarity Scores track BEV performance, supporting practical use for development efficiency and interpretability in autonomous driving systems.
Abstract
End-to-end models are emerging as the mainstream in autonomous driving perception. However, because their internal mechanisms cannot be examined in detail, development is less efficient and trust is harder to establish. To address this issue, we present the Independent Functional Module Evaluation for Bird's-Eye-View Perception Model (BEV-IFME), a novel framework that compares a module's feature maps against Ground Truth within a unified semantic Representation Space to quantify their similarity, thereby assessing the training maturity of individual functional modules. The core of the framework is feature map encoding and representation alignment, carried out by our proposed two-stage Alignment AutoEncoder, which preserves salient information and keeps the feature structure consistent. The Similarity Score, our metric for the training maturity of functional modules, shows a robust positive correlation with BEV metrics, with an average correlation coefficient of 0.9387, supporting the framework's reliability for assessment.
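As a rough illustration of the two quantities named above, the sketch below computes a cosine-based similarity between module and GT representations and then a Pearson correlation between per-configuration scores and a BEV metric. It is only a sketch under stated assumptions: the function names, the (N, D) array layout, and the averaging over N samples are hypothetical choices, and the text does not say which correlation coefficient was used, so Pearson is assumed here. It does not reproduce the Alignment AutoEncoder itself.

```python
import numpy as np


def similarity_score(module_repr, gt_repr, eps=1e-12):
    """Mean cosine similarity between paired rows of two (N, D) arrays.

    Assumption: both inputs are already encoded into the same
    representation space, one row per sample.
    """
    module_repr = np.asarray(module_repr, dtype=np.float64)
    gt_repr = np.asarray(gt_repr, dtype=np.float64)
    dots = np.sum(module_repr * gt_repr, axis=1)
    norms = np.linalg.norm(module_repr, axis=1) * np.linalg.norm(gt_repr, axis=1)
    # Guard against division by zero for all-zero rows.
    return float(np.mean(dots / np.maximum(norms, eps)))


def pearson_correlation(scores, metric_values):
    """Pearson correlation between similarity scores and a BEV metric.

    Assumption: one score and one metric value (e.g. mAP or NDS)
    per evaluated configuration or checkpoint.
    """
    scores = np.asarray(scores, dtype=np.float64)
    metric_values = np.asarray(metric_values, dtype=np.float64)
    return float(np.corrcoef(scores, metric_values)[0, 1])


if __name__ == "__main__":
    # Synthetic placeholder data only, not results from the paper.
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(16, 32))
    noisy = gt + 0.5 * rng.normal(size=gt.shape)
    print("similarity of noisy module output to GT:", similarity_score(noisy, gt))
```

In use, one would compute `similarity_score` once per module configuration and pass the resulting list, together with the matching mAP or NDS values, to `pearson_correlation`.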
