Rethinking Autoencoders for Medical Anomaly Detection from A Theoretical Perspective
Yu Cai, Hao Chen, Kwang-Ting Cheng
TL;DR
This work examines the theoretical foundations of autoencoder-based medical anomaly detection. It argues that the standard reconstruction objective can fail through an "identical shortcut": when latent capacity is high, the autoencoder learns to copy its input and reconstructs abnormal regions as well as normal ones. Using information theory, it derives that the latent-space entropy $H(\mathbf{Z})$ should match the normal-data entropy $H(\mathbf{X}_n)$, which can be approached by constraining the latent dimensionality so that normal content is preserved while as little abnormal information as possible is encoded. Experiments on four datasets across two image modalities show that small, well-chosen latent dimensions yield strong anomaly-detection performance, often outperforming latent-space restriction methods. The study offers a principled design guideline for AEs in anomaly detection and suggests future work on self-adaptive entropy control to remove the need for manual tuning.
Abstract
Medical anomaly detection aims to identify abnormal findings using only normal training data, playing a crucial role in health screening and in recognizing rare diseases. Reconstruction-based methods, particularly those utilizing autoencoders (AEs), are dominant in this field. They rest on the assumption that AEs trained only on normal data cannot reconstruct unseen abnormal regions well, which enables anomaly detection based on reconstruction errors. However, this assumption does not always hold because of the mismatch between the reconstruction training objective and the anomaly detection task objective, rendering these methods theoretically unsound. This study focuses on providing a theoretical foundation for AE-based reconstruction methods in anomaly detection. By leveraging information theory, we elucidate the principles of these methods and reveal that the key to improving AEs for anomaly detection lies in minimizing the information entropy of latent vectors. Experiments on four datasets with two image modalities validate the effectiveness of our theory. To the best of our knowledge, this is the first effort to theoretically clarify the principles and design philosophy of AEs for anomaly detection. The code is available at \url{https://github.com/caiyu6666/AE4AD}.
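
The reconstruction-error principle and the effect of latent capacity can be illustrated with a toy sketch. This is not the authors' implementation (see the repository above for that). It assumes a linear autoencoder fitted in closed form via PCA, synthetic 16-dimensional data whose normal samples lie near a 4-dimensional subspace, and hypothetical helper names (`fit_linear_ae`, `anomaly_score`, `make_data`, `score_gap`) introduced only for illustration.

```python
import numpy as np


def fit_linear_ae(x_train, latent_dim):
    """Fit a linear autoencoder in closed form (PCA) on normal data only."""
    mean = x_train.mean(axis=0)
    # Rows of vt are orthonormal principal directions, sorted by variance.
    _, _, vt = np.linalg.svd(x_train - mean, full_matrices=False)
    return mean, vt[:latent_dim]


def anomaly_score(x, mean, components):
    """Per-sample mean squared reconstruction error."""
    z = (x - mean) @ components.T      # encode
    x_hat = z @ components + mean      # decode
    return np.mean((x - x_hat) ** 2, axis=1)


def make_data(rng, n, basis, anomaly=False):
    """Toy data: normal samples lie near the row space of `basis`."""
    coeffs = rng.normal(size=(n, basis.shape[0]))
    x = coeffs @ basis + 0.01 * rng.normal(size=(n, basis.shape[1]))
    if anomaly:
        # Assumption: anomalies add content outside the normal subspace.
        x = x + rng.normal(size=x.shape)
    return x


def score_gap(latent_dim, seed=0):
    """Mean anomaly score of abnormal samples minus that of normal samples."""
    rng = np.random.default_rng(seed)
    basis = rng.normal(size=(4, 16))   # 4 normal factors in 16 dimensions
    x_train = make_data(rng, 500, basis)
    x_normal = make_data(rng, 200, basis)
    x_abnormal = make_data(rng, 200, basis, anomaly=True)
    mean, comps = fit_linear_ae(x_train, latent_dim)
    gap = anomaly_score(x_abnormal, mean, comps).mean()
    return gap - anomaly_score(x_normal, mean, comps).mean()


if __name__ == "__main__":
    for d in (2, 4, 8, 16):
        print(f"latent_dim={d:2d}  score gap={score_gap(d):.4f}")
```

In this toy setting, a latent dimension equal to the number of normal factors leaves a clear gap between abnormal and normal scores, whereas a latent dimension equal to the input dimension lets the model copy any input, so the gap collapses to numerical zero. Real image autoencoders are nonlinear, and the appropriate latent size there has to be determined empirically.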
