Multi-Flow: Multi-View-Enriched Normalizing Flows for Industrial Anomaly Detection

Mathis Kruse, Bodo Rosenhahn

TL;DR

Multi-Flow introduces a multi-view normalizing flow for industrial anomaly detection, explicitly fusing information across multiple viewpoints to improve exact likelihood estimation on object patches. The method operates on features from a frozen extractor, uses background removal via MVANet, and employs cross-view st-Networks with top-view and neighbor-view connections to share information across views. Training maximizes likelihood with a noise-conditioned flow and a change-of-variables loss, achieving state-of-the-art performance on Real-IAD for both sample-wise and image-wise anomaly detection, with ablations confirming the value of cross-view fusion and background removal. The approach demonstrates strong practical impact for visual inspection in manufacturing, enabling view-agnostic anomaly detection and robust performance with scalable multi-view setups.
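As a reading aid, the change-of-variables objective mentioned above can be written in its generic form. The symbols here are assumptions and are not taken from the paper: $f_\theta$ is the flow, $y$ the extracted feature, $\epsilon$ the conditioning noise, $z$ the latent code with a standard normal prior, and $d$ the latent dimension. The paper's own notation and exact loss may differ.

$$
\log p_Y(y \mid \epsilon) = \log p_Z\big(f_\theta(y; \epsilon)\big) + \log \left| \det \frac{\partial f_\theta(y; \epsilon)}{\partial y} \right|
$$

$$
\mathcal{L}(y) = -\log p_Y(y \mid \epsilon) = \frac{1}{2} \big\| f_\theta(y; \epsilon) \big\|_2^2 - \log \left| \det \frac{\partial f_\theta(y; \epsilon)}{\partial y} \right| + \frac{d}{2} \log(2\pi)
$$

Minimizing $\mathcal{L}$ over defect-free training features maximizes their likelihood. At test time, a low likelihood (a high $\mathcal{L}$) would then indicate an anomaly.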

Abstract

With more well-performing anomaly detection methods proposed, many of the single-view tasks have been solved to a relatively good degree. However, real-world production scenarios often involve complex industrial products, whose properties may not be fully captured by one single image. While normalizing flow based approaches already work well in single-camera scenarios, they currently do not make use of the priors in multi-view data. We aim to bridge this gap by using these flow-based models as a strong foundation and propose Multi-Flow, a novel multi-view anomaly detection method. Multi-Flow makes use of a novel multi-view architecture, whose exact likelihood estimation is enhanced by fusing information across different views. For this, we propose a new cross-view message-passing scheme, letting information flow between neighboring views. We empirically validate it on the real-world multi-view data set Real-IAD and reach a new state-of-the-art, surpassing current baselines in both image-wise and sample-wise anomaly detection tasks.

Paper Structure

This paper contains 24 sections, 7 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Overview of the multi-view enriched normalizing flow compared to SimpleNet. Multiple views of an object (with anomalies marked) are processed by the normalizing flow. Where single-view methods may struggle, Multi-Flow detects anomalies irrespective of the object's viewpoint.
  • Figure 2: RealNVP coupling block with conditioning. The input $y^{in}$ is augmented with a noise vector $\epsilon$ and split into two parts along its channel dimension. Each split is concatenated with the conditioning noise component $\epsilon$ and passed to an $st$-network, which computes the components $s$ and $t$ used to transform the opposing path.
  • Figure 3: Examples of background removal using the pre-trained dichotomous segmentation network MVANet on Real-IAD. The usually homogeneous background regions can be extracted in all cases without any grave mistakes.
  • Figure 4: Architecture for sharing information across views. Each $st$-network implements one of these blocks. The raw input consists of one feature map per object view. Each feature map is fine-tuned by being passed through a ConvBlock. Then, subsequent 2D convolutions are applied and the data is aggregated according to the multi-view setup in Eq. \ref{eq:neighbor_sums}. Top-view connections let information flow from the bird's-eye view to all others. Neighbor-view connections let information flow between adjacent side views of the object.
  • Figure 5: Qualitative results of detecting anomalies in various multi-view poses. For each object, the top row contains the image with anomalies marked. The middle row is the raw anomaly map output by the model. The lower row contains a superimposition of both input image and anomaly map values. The depicted classes are "Switch", "PCB", "Fire Hood", and "Toy Brick".
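
The conditioned coupling block described in the Figure 2 caption can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The helper `make_st_network` is a hypothetical stand-in (a single fixed random linear layer) for the convolutional, cross-view $st$-networks, the inputs are flat vectors rather than feature maps, and the tanh clamping of $s$ is an assumed stabilization choice. The sketch only shows how each half is transformed using $s$ and $t$ computed from the opposing half plus the noise $\epsilon$, and how the log-determinant enters a change-of-variables loss.

```python
import math
import random


def make_st_network(in_dim, out_dim, seed):
    """Hypothetical stand-in for an st-network: one fixed random linear layer."""
    rng = random.Random(seed)
    w_s = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    w_t = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]

    def st(x):
        # tanh bounds the log-scale s (assumed choice) so exp(s) stays well behaved
        s = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w_s]
        t = [sum(w * v for w, v in zip(row, x)) for row in w_t]
        return s, t

    return st


def coupling_forward(y, eps, st_a, st_b):
    """Affine coupling: each half is transformed by s, t from the other half and eps."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s2, t2 = st_a(y1 + eps)  # list concatenation of one split with the noise
    z2 = [v * math.exp(s) + t for v, s, t in zip(y2, s2, t2)]
    s1, t1 = st_b(z2 + eps)
    z1 = [v * math.exp(s) + t for v, s, t in zip(y1, s1, t1)]
    log_det = sum(s1) + sum(s2)  # log|det J| of the affine transforms
    return z1 + z2, log_det


def coupling_inverse(z, eps, st_a, st_b):
    """Exact inverse of coupling_forward, undoing the two steps in reverse order."""
    half = len(z) // 2
    z1, z2 = z[:half], z[half:]
    s1, t1 = st_b(z2 + eps)
    y1 = [(v - t) * math.exp(-s) for v, s, t in zip(z1, s1, t1)]
    s2, t2 = st_a(y1 + eps)
    y2 = [(v - t) * math.exp(-s) for v, s, t in zip(z2, s2, t2)]
    return y1 + y2


def negative_log_likelihood(z, log_det):
    """Change-of-variables loss under a standard normal prior (constant dropped)."""
    return 0.5 * sum(v * v for v in z) - log_det


# Toy usage: a 4-dimensional feature and a 2-dimensional conditioning noise vector.
st_a = make_st_network(in_dim=4, out_dim=2, seed=0)
st_b = make_st_network(in_dim=4, out_dim=2, seed=1)
y = [0.3, -1.2, 0.8, 0.1]
eps = [0.05, -0.02]
z, log_det = coupling_forward(y, eps, st_a, st_b)
y_rec = coupling_inverse(z, eps, st_a, st_b)
nll = negative_log_likelihood(z, log_det)
```

Because each half is scaled and shifted only by quantities computed from the other half, the block can be inverted exactly and its Jacobian log-determinant is just the sum of the $s$ values. That property is what makes exact likelihood estimation tractable in this family of models.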