Mitigating the Likelihood Paradox in Flow-based OOD Detection via Entropy Manipulation
Donghwan Kim, Hyunsoo Yoon
TL;DR
Normalizing flows can rank out-of-distribution (OOD) inputs above in-distribution (ID) inputs because likelihood is sensitive to input entropy. The authors propose SPEM, a training-free, test-time entropy manipulation method that scales perturbations according to semantic similarity to an in-distribution memory bank, preserving the underlying likelihood score while improving ID/OOD separation. They prove lower bounds showing that entropy perturbations can widen the ID/OOD log-likelihood gap, and they validate SPEM on ten ID/OOD pairs, where it achieves consistent AUROC gains over baselines and is robust to entropy ordering. Analysis of SPEM-noise reveals scenarios where perturbing with Gaussian noise alone can yield strong separation, highlighting the role of the entropy and KL terms. Overall, SPEM offers a practical, architecture-agnostic way to align likelihood-based OOD detection with semantic typicality without additional model training.
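The entropy and KL terms mentioned above can be made concrete with a standard identity for the expected log-likelihood. The notation here is assumed for illustration and is not taken from the paper: p_θ is the flow's density, p* the ID data distribution, q an OOD distribution, and H denotes differential entropy. The paper's own lower bounds are not reproduced here.

```latex
\mathbb{E}_{x \sim q}\big[\log p_\theta(x)\big] = -H(q) - \mathrm{KL}\big(q \,\|\, p_\theta\big)

\mathbb{E}_{x \sim p^*}\big[\log p_\theta(x)\big] - \mathbb{E}_{x \sim q}\big[\log p_\theta(x)\big]
  = \big[H(q) - H(p^*)\big] + \big[\mathrm{KL}(q \,\|\, p_\theta) - \mathrm{KL}(p^* \,\|\, p_\theta)\big]
```

When the OOD data has lower entropy than the ID data, the first bracket is negative and can outweigh the KL bracket, so the expected gap favors the OOD inputs. Raising the entropy of an input distribution through perturbation changes both brackets, which is the lever that entropy manipulation acts on.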
Abstract
Deep generative models that can tractably compute input likelihoods, including normalizing flows, often assign unexpectedly high likelihoods to out-of-distribution (OOD) inputs. We mitigate this likelihood paradox by manipulating input entropy based on semantic similarity, applying stronger perturbations to inputs that are less similar to an in-distribution memory bank. We provide a theoretical analysis showing that entropy control increases the expected log-likelihood gap between in-distribution and OOD samples in favor of in-distribution samples, and we explain why the procedure works without any additional training of the density model. We then evaluate our method against likelihood-based OOD detectors on standard benchmarks and find consistent AUROC improvements over baselines, supporting our explanation.
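The similarity-scaled perturbation described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the feature extractor, the similarity measure, the scaling rule, or the noise distribution, so cosine similarity, a linear scale, and Gaussian noise are choices made here for illustration only. All function names and the `max_sigma` parameter are hypothetical, and a standard normal log-density stands in for a trained flow.

```python
import numpy as np


def cosine_similarity_to_bank(feature, bank):
    """Maximum cosine similarity between one feature vector and the bank rows."""
    f = feature / (np.linalg.norm(feature) + 1e-12)
    b = bank / (np.linalg.norm(bank, axis=1, keepdims=True) + 1e-12)
    return float(np.max(b @ f))


def perturbation_scale(similarity, max_sigma=0.5):
    """Assumed linear rule: lower similarity gives a larger noise scale."""
    s = np.clip(similarity, -1.0, 1.0)
    return max_sigma * (1.0 - s) / 2.0


def perturb(x, sigma, rng):
    """Add isotropic Gaussian noise with standard deviation sigma."""
    return x + sigma * rng.standard_normal(x.shape)


def gaussian_log_likelihood(x):
    """Placeholder density (standard normal); a trained flow would go here."""
    d = x.size
    return float(-0.5 * np.sum(x ** 2) - 0.5 * d * np.log(2.0 * np.pi))


def score(x, feature, bank, rng, max_sigma=0.5):
    """Perturb x according to its semantic similarity, then score the result."""
    sim = cosine_similarity_to_bank(feature, bank)
    sigma = perturbation_scale(sim, max_sigma)
    return gaussian_log_likelihood(perturb(x, sigma, rng)), sigma
```

In this sketch an input whose feature matches a bank entry receives almost no noise and keeps its original score, while a dissimilar input is perturbed more strongly before scoring. No parameters of the density model are updated, which matches the training-free setting the abstract describes.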
