Low-Quality Image Detection by Hierarchical VAE
Tomoyasu Nanaumi, Kazuhiko Kawamoto, Hiroshi Kera
TL;DR
The paper tackles unsupervised detection of low-quality images under unseen degradations by exploiting the fact that partial reconstruction by a hierarchical VAE fails on such images. It uses a multi-layer VAE to produce partial reconstructions and defines a KL-divergence-based score, $S_{KL}$, between the posterior over the higher latent variables given the input and the same posterior given its partial reconstruction, with $k$ selected adaptively by an FFT-based frequency criterion. Experiments on FFHQ-256 and ImageNet-64 against several unsupervised OOD baselines show that the proposed method achieves the best average AUROC and performs stably across corruption types. It also provides visual clues that help humans recognize degraded images in thumbnail views. The approach offers a scalable, unsupervised data-cleaning tool for assembling high-quality image sets for rosters, photo archives, and generative-model training datasets.
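To make the KL-based score concrete, here is a minimal sketch of how such a quantity could be computed. It is not the authors' implementation. It assumes diagonal Gaussian posteriors at every latent layer, treats $k$ as a layer index above which the KL terms are summed, and picks one KL direction (input posterior relative to reconstruction posterior); none of these details are stated in this summary. The names `gaussian_kl` and `kl_score` are hypothetical.

```python
import numpy as np


def gaussian_kl(mu_a, logvar_a, mu_b, logvar_b):
    """KL( N(mu_a, var_a) || N(mu_b, var_b) ) for diagonal Gaussians, summed over dims."""
    mu_a, logvar_a, mu_b, logvar_b = (
        np.asarray(t, dtype=float) for t in (mu_a, logvar_a, mu_b, logvar_b)
    )
    var_a, var_b = np.exp(logvar_a), np.exp(logvar_b)
    kl = 0.5 * (logvar_b - logvar_a + (var_a + (mu_a - mu_b) ** 2) / var_b - 1.0)
    return float(kl.sum())


def kl_score(post_input, post_recon, k):
    """Sum the per-layer KL terms over latent layers with index >= k.

    post_input / post_recon: lists of (mu, logvar) pairs, one per latent layer,
    ordered from lowest to highest layer (an assumed convention). The first list
    is the posterior given the input image, the second the posterior given its
    partial reconstruction.
    """
    return sum(
        gaussian_kl(mu_a, lv_a, mu_b, lv_b)
        for (mu_a, lv_a), (mu_b, lv_b) in zip(post_input[k:], post_recon[k:])
    )
```

Under these assumptions, an image whose partial reconstruction yields nearly the same higher-layer posteriors as the image itself receives a score close to zero, while a mismatch between the two sets of posteriors raises the score.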
Abstract
To build an employee roster, a photo album, or a training dataset for generative models, one needs to collect high-quality images while discarding low-quality ones. This study addresses a new task: unsupervised detection of low-quality images. We propose a method that not only detects low-quality images with various types of degradation but also provides visual clues to the degradation, based on the observation that partial reconstruction by hierarchical variational autoencoders fails for low-quality images. Experiments show that our method outperforms several unsupervised out-of-distribution detection methods and gives visual clues that help humans recognize low-quality images even in thumbnail view.
