Reason-IAD: Knowledge-Guided Dynamic Latent Reasoning for Explainable Industrial Anomaly Detection

Peng Chen, Chao Huang, Yunkang Cao, Chengliang Liu, Wenqiang Wang, Mingbo Yang, Li Shen, Wenqi Ren, Xiaochun Cao

TL;DR

Reason-IAD tackles explainable industrial anomaly detection by combining retrieval-augmented, category-specific knowledge with entropy-guided latent reasoning in a multimodal LLM framework. It retrieves the top-$k$ category descriptions to condition the model prompt, then iteratively updates a compact set of latent think tokens $\mathcal{Z}$ with a REINFORCE-like objective whose reward is based on the predictive entropy $\mathcal{H}$. A dynamic visual injection mechanism, guided by per-iteration rewards, adds informative image patches to the latent sequence so that reasoning concentrates on defect-relevant regions. On the seven subtasks of the MMAD benchmark, Reason-IAD achieves strong one-shot and zero-shot performance, surpassing several baselines while producing interpretable reasoning about industrial defects.
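
The retrieval step can be pictured with a minimal sketch. It assumes that the query image and the category descriptions have already been embedded into a shared vector space and that relevance is scored by cosine similarity; neither assumption is confirmed by the summary above. The function names `top_k_descriptions` and `build_prompt`, the toy embeddings, and the prompt layout are all hypothetical.

```python
import numpy as np

def top_k_descriptions(query_emb, desc_embs, descriptions, k=3):
    """Return the k descriptions most cosine-similar to the query embedding."""
    q = query_emb / np.linalg.norm(query_emb)
    d = desc_embs / np.linalg.norm(desc_embs, axis=1, keepdims=True)
    sims = d @ q                       # cosine similarity per description
    order = np.argsort(-sims)[:k]      # indices of the k highest scores
    return [(descriptions[i], float(sims[i])) for i in order]

def build_prompt(question, retrieved):
    """Prepend the retrieved descriptions to the question (hypothetical layout)."""
    context = "\n".join(f"- {text}" for text, _ in retrieved)
    return f"Category knowledge:\n{context}\n\nQuestion: {question}"

# Toy data: three category descriptions with made-up 3-d embeddings.
descriptions = [
    "bottle: defects include broken rims and contamination",
    "cable: defects include bent wires and missing insulation",
    "screw: defects include scratched heads and damaged threads",
]
desc_embs = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
query_emb = np.array([0.9, 0.1, 0.3])  # stands in for an image embedding

retrieved = top_k_descriptions(query_emb, desc_embs, descriptions, k=2)
prompt = build_prompt("Is there any defect in the object?", retrieved)
print(prompt)
```

In this toy setup the query is closest to the bottle description, followed by the screw description, so only those two reach the prompt.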

Abstract

Industrial anomaly detection demands precise reasoning over fine-grained defect patterns. However, existing multimodal large language models (MLLMs), pretrained on general-domain data, often struggle to capture category-specific anomalies, thereby limiting both detection accuracy and interpretability. To address these limitations, we propose Reason-IAD, a knowledge-guided dynamic latent reasoning framework for explainable industrial anomaly detection. Reason-IAD comprises two core components. First, a retrieval-augmented knowledge module incorporates category-specific textual descriptions into the model input, enabling context-aware reasoning over domain-specific defects. Second, an entropy-driven latent reasoning mechanism conducts iterative exploration within a compact latent space using optimizable latent think tokens, guided by an entropy-based reward that encourages confident and stable predictions. Furthermore, a dynamic visual injection strategy selectively incorporates the most informative image patches into the latent sequence, directing the reasoning process toward regions critical for anomaly detection. Extensive experimental results demonstrate that Reason-IAD consistently outperforms state-of-the-art methods. The code will be publicly available at https://github.com/chenpeng052/Reason-IAD.
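
To make the entropy-driven latent reasoning more concrete, the following is a rough sketch of one way an entropy-based reward could drive a REINFORCE-like update of latent think tokens. It is not the authors' implementation. The MLLM is replaced by a hypothetical frozen linear head `W` applied to mean-pooled tokens, exploration uses Gaussian perturbations with a mean-reward baseline, and the values of `steps`, `n_samples`, `sigma`, and `lr` are illustrative assumptions. Dynamic visual injection is omitted.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())  # subtract max for numerical stability
    return e / e.sum()

def predictive_entropy(probs, eps=1e-12):
    """Shannon entropy of the answer distribution (lower means more confident)."""
    return float(-(probs * np.log(probs + eps)).sum())

def answer_probs(Z, W):
    # Hypothetical stand-in for the MLLM: mean-pool the think tokens,
    # then apply a frozen linear head to obtain answer probabilities.
    return softmax(W @ Z.mean(axis=0))

def refine_latents(Z, W, steps=50, n_samples=32, sigma=0.1, lr=0.5, seed=0):
    """REINFORCE-like refinement of latent tokens with reward = -entropy."""
    rng = np.random.default_rng(seed)
    Z = Z.copy()
    best_Z, best_H = Z.copy(), predictive_entropy(answer_probs(Z, W))
    history = [best_H]
    for _ in range(steps):
        # Sample perturbed token sets from N(Z, sigma^2 I).
        noise = rng.normal(0.0, sigma, size=(n_samples,) + Z.shape)
        rewards = np.array(
            [-predictive_entropy(answer_probs(Z + n, W)) for n in noise]
        )
        advantages = rewards - rewards.mean()  # mean baseline reduces variance
        # Score-function estimate: grad of log N(z; Z, sigma^2 I) is noise / sigma^2.
        grad = (advantages[:, None, None] * noise).mean(axis=0) / sigma**2
        Z = Z + lr * grad                      # ascend the expected reward
        H = predictive_entropy(answer_probs(Z, W))
        history.append(H)
        if H < best_H:                         # keep the most confident iterate
            best_Z, best_H = Z.copy(), H
    return best_Z, history

# Toy usage: 4 think tokens of width 8, binary answer (normal vs. anomalous).
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 8))
Z0 = rng.normal(size=(4, 8))
Z_best, history = refine_latents(Z0, W)
print(f"initial entropy: {history[0]:.4f}")
print(f"best entropy:    {predictive_entropy(answer_probs(Z_best, W)):.4f}")
```

Returning the lowest-entropy iterate rather than the last one is a convenience of this sketch: because the gradient estimate is noisy, individual steps are not guaranteed to reduce entropy.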

Paper Structure

This paper contains 21 sections, 9 equations, 14 figures, 5 tables, and 1 algorithm.

Figures (14)

  • Figure 1: Comparison between existing reasoning methods and the proposed Reason-IAD. (a) Existing methods conduct reasoning through explicit chains of thought. (b) Reason-IAD retrieves domain-specific knowledge and identifies anomalies via iterative latent reasoning.
  • Figure 2: Overview of the proposed Reason-IAD. (a) Given a query image, Reason-IAD retrieves the most relevant category-specific descriptions and incorporates them into the model prompt to enhance anomaly awareness. (b) An entropy-guided latent reasoning module iteratively refines latent think tokens and dynamically injects visual evidence to improve reasoning accuracy. (c) Illustration of the iterative latent-space reasoning process, in which reward signals and visual cues progressively guide the model toward the final prediction.
  • Figure 3: Performance gains of Reason-IAD over baseline models under the one-shot setting.
  • Figure 4: Effect of iteration count on anomaly discrimination performance. Increasing iterations consistently improves accuracy while maintaining stability.
  • Figure 5: Comparison of model outputs for anomaly detection.
  • ...and 9 more figures