Multi-View Industrial Anomaly Detection with Epipolar Constrained Cross-View Fusion
Yifan Liu, Xun Xu, Shijie Li, Jingyi Liao, Xulei Yang
TL;DR
This work tackles multi-view industrial anomaly detection by embedding geometric priors into cross-view fusion. It introduces an Epipolar Attention Module (EAM) that constrains cross-view attention along epipolar lines, and a multi-center pre-training (MCP) strategy with per-view memory banks and synthetic negative samples to stabilize learning. The combination yields a memory-bank–based, geometry-aware framework (MVEAD) that outperforms state-of-the-art methods on Real-IAD in both sample- and image-level metrics, especially under multi-class settings. The approach offers practical benefits for real-world inspection pipelines by improving robustness and efficiency in multi-view anomaly localization.
Abstract
Multi-camera systems provide richer contextual information for industrial anomaly detection. However, traditional methods process each view independently, disregarding the complementary information across viewpoints. Existing multi-view anomaly detection approaches typically employ data-driven cross-view attention for feature fusion but fail to leverage the unique geometric properties of multi-camera setups. In this work, we introduce an epipolar geometry-constrained attention module to guide cross-view fusion, ensuring more effective information aggregation. To further enhance the potential of cross-view attention, we propose a pretraining strategy inspired by memory bank-based anomaly detection. This strategy encourages normal feature representations to form multiple local clusters and incorporates multi-view-aware negative sample synthesis to regularize pretraining. We demonstrate that our epipolar-guided multi-view anomaly detection framework outperforms existing methods on the state-of-the-art multi-view anomaly detection dataset.
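To make the idea of epipolar-constrained cross-view attention concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a known fundamental matrix `F` that maps a pixel in view A to its epipolar line in view B (so that `x_B^T F x_A = 0`), a hard distance threshold `tau` in pixels, and single-head attention over flattened feature maps of equal size. The function names `epipolar_mask` and `masked_cross_attention` are hypothetical and chosen only for illustration.

```python
import numpy as np

def epipolar_mask(F, h, w, tau=1.0):
    """Return a boolean (h*w, h*w) mask. Entry [q, k] is True when key pixel k
    in view B lies within tau pixels of the epipolar line of query pixel q
    in view A. Pixels are flattened row-major (index = y * w + x)."""
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)], axis=1).astype(float)
    lines = pts @ F.T                      # row q holds the line l_q = F @ x_q
    norm = np.linalg.norm(lines[:, :2], axis=1, keepdims=True)
    norm = np.maximum(norm, 1e-8)          # guard against degenerate lines
    dist = np.abs(lines @ pts.T) / norm    # point-to-line distance, shape (N, N)
    return dist <= tau

def masked_cross_attention(q, k, v, mask):
    """Scaled dot-product attention where each query attends only to keys
    allowed by the mask. Queries with no allowed key receive a zero output."""
    logits = q @ k.T / np.sqrt(q.shape[-1])
    logits = np.where(mask, logits, -np.inf)
    has_key = mask.any(axis=1, keepdims=True)
    logits = np.where(has_key, logits, 0.0)        # avoid NaN for empty rows
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.where(has_key, np.exp(logits), 0.0)
    weights = weights / np.maximum(weights.sum(axis=1, keepdims=True), 1e-12)
    return weights @ v

# Usage: a rectified stereo pair (pure horizontal translation), for which the
# epipolar line of pixel (u, v) is simply the image row y = v in the other view.
F_rect = np.array([[0.0, 0.0, 0.0],
                   [0.0, 0.0, -1.0],
                   [0.0, 1.0, 0.0]])
mask = epipolar_mask(F_rect, h=3, w=4, tau=0.5)
rng = np.random.default_rng(0)
feat_a = rng.normal(size=(12, 8))   # queries from view A
feat_b = rng.normal(size=(12, 8))   # keys and values from view B
fused = masked_cross_attention(feat_a, feat_b, feat_b, mask)
```

In the rectified example each query attends only to the four key positions in its own row. A learned module could replace the hard threshold with a soft distance-based weighting; that choice is left open here.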
