Rethinking Autoencoders for Medical Anomaly Detection from A Theoretical Perspective

Yu Cai; Hao Chen; Kwang-Ting Cheng

TL;DR

This work examines the theoretical foundations of autoencoder-based medical anomaly detection, arguing that the standard reconstruction objective can fail because of an "identical shortcut" when latent capacity is high. Using information theory, it shows that the latent-space entropy $H(\mathbf{Z})$ should align with the normal-data entropy $H(\mathbf{X}_n)$, which can be achieved by constraining the latent dimensionality so that the latent vector preserves normal content while encoding as little abnormal information as possible. Empirical results on four datasets across two image modalities show that small, well-chosen latent dimensions yield strong anomaly-detection performance, often outperforming latent-space restriction methods. The study provides a principled design guideline for AEs in anomaly detection and suggests future work on self-adaptive entropy control to remove the need for manual tuning.

Abstract

Medical anomaly detection aims to identify abnormal findings using only normal training data, playing a crucial role in health screening and recognizing rare diseases. Reconstruction-based methods, particularly those utilizing autoencoders (AEs), are dominant in this field. They work under the assumption that AEs trained on only normal data cannot reconstruct unseen abnormal regions well, thereby enabling anomaly detection based on reconstruction errors. However, this assumption does not always hold due to the mismatch between the reconstruction training objective and the anomaly detection task objective, rendering these methods theoretically unsound. This study focuses on providing a theoretical foundation for AE-based reconstruction methods in anomaly detection. By leveraging information theory, we elucidate the principles of these methods and reveal that the key to improving AEs in anomaly detection lies in minimizing the information entropy of latent vectors. Experiments on four datasets with two image modalities validate the effectiveness of our theory. To the best of our knowledge, this is the first effort to theoretically clarify the principles and design philosophy of AEs for anomaly detection. The code is available at https://github.com/caiyu6666/AE4AD.
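The reconstruction-error assumption described in the abstract, and the way an identity-like mapping breaks it, can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a hypothetical one-dimensional linear "autoencoder" whose encoder projects onto a unit vector `u` and whose decoder maps back along `u`, and all names (`bottleneck_reconstruct`, `identity_reconstruct`, `anomaly_score`) are made up for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def bottleneck_reconstruct(x, u):
    """Toy linear AE with latent dimension 1 (hypothetical).

    Encoder: z = <u, x>; decoder: x_hat = z * u. Assumes u has unit norm.
    """
    z = dot(u, x)
    return [z * ui for ui in u]


def identity_reconstruct(x):
    """Stand-in for an AE that has learned to copy its input unchanged."""
    return list(x)


def anomaly_score(x, x_hat):
    """Mean squared reconstruction error, used as the anomaly score."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


# Assumption: normal data lies along the direction u (unit norm).
u = [0.5, 0.5, 0.5, 0.5]
x_normal = [2.0 * ui for ui in u]

# Assumption: the "lesion" delta is orthogonal to u, so the bottleneck drops it.
delta = [1.0, -1.0, 0.0, 0.0]
x_abnormal = [a + d for a, d in zip(x_normal, delta)]

score_normal = anomaly_score(x_normal, bottleneck_reconstruct(x_normal, u))
score_abnormal = anomaly_score(x_abnormal, bottleneck_reconstruct(x_abnormal, u))
score_shortcut = anomaly_score(x_abnormal, identity_reconstruct(x_abnormal))

print(score_normal, score_abnormal, score_shortcut)
```

In this toy setting the narrow bottleneck reconstructs the normal sample exactly and leaves the lesion as reconstruction error, while the copying model assigns the abnormal sample a score of zero, so the anomaly goes undetected.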

Paper Structure (16 sections, 2 theorems, 6 equations, 3 figures, 3 tables)

Introduction
The pipeline and limitation of AE in anomaly detection
Theoretical analysis of AE in anomaly detection
An inherent property of AE
The optimal solution for AE
Experiments
Datasets and implementation details
Results and analysis
Validation of Proposition 1.
Validation of Proposition 2.
Comparison with other methods.
Conclusion and Discussion
Acknowledgments.
Disclosure of Interests.
Datasets
...and 1 more sections

Key Result

Proposition 1

Given an AE (Fig. 1), let $\mathbf{Z}_0 \in \mathbb{R}^D$ be the feature vector before the latent vector $\mathbf{Z} \in \mathbb{R}^d$, and let $\hat{\mathbf{Z}}_0 \in \mathbb{R}^D$ be the feature vector after $\mathbf{Z}$. Then, if $d < \frac{D}{2}$, the AE cannot learn the identical mapping (i.e., the identity function).

Figures (3)

Figure 1: Overview of reconstruction-based anomaly detection (AD) with an AE. The model is trained to minimize the reconstruction loss on normal images $\mathbf{X}_n$. During inference, the trained model is assumed to be unable to reconstruct lesions $\delta$ in abnormal images $\mathbf{X}_a$.
Figure 2: Venn diagram of $H(\mathbf{X}_n)$, $H(\mathbf{X}_a)$, and $H(\mathbf{Z})$. (a) Relationship between $H(\mathbf{X}_n)$ and $H(\mathbf{X}_a)$; (b) $H(\mathbf{Z})$ of an AE trained with Eq. \ref{eq:train}; (c) $H(\mathbf{Z})$ of an optimal AE.
Figure 3: Reconstruction errors on the RSNA dataset with respect to the latent dimension.

Theorems & Definitions (4)

Proposition 1
Proof
Proposition 2
Proof
