What makes an image realistic?
Lucas Theis
TL;DR
This work reframes image realism as a problem of randomness deficiency. It introduces universal critics, which compare observed data against a mixture over computable data-generating processes. The key construction is $U(x) = -\log P(x) - K(x)$, where $K(x) = -\log S(x)$ and $S(x)$ is a Solomonoff-style mixture; it provides a normative target for realism that does not rely on adversarial training. Using batched variants $U^B$ and concepts from universal testing, the paper connects information-theoretic limits to practical evaluation and optimization, and shows how MCMC, MDL, and diffusion-guided methods can approximate or motivate these ideas. The discussion relates the framework to input complexity and score-distillation approaches. It argues that universal critics offer a principled, flexible foundation for assessing and guiding perceptual realism in generative models, and it points to directions for practical, scalable implementations.
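To make the construction $U(x) = -\log P(x) - K(x)$ concrete, the following is a minimal sketch and not the paper's method. Since $K(x)$ is not computable, the sketch swaps in the zlib-compressed length as a crude stand-in; this only upper-bounds $K(x)$ up to a constant, so the result is at best a rough lower bound on $U(x)$. The model $P$ (independent, uniformly distributed bytes) and the function names `neg_log2_p_uniform`, `complexity_proxy_bits`, and `critic_estimate` are assumptions chosen for illustration.

```python
import random
import zlib


def neg_log2_p_uniform(x: bytes) -> float:
    """-log2 P(x) for a toy model P: i.i.d. uniform bytes (an assumption)."""
    return 8.0 * len(x)


def complexity_proxy_bits(x: bytes) -> float:
    """Stand-in for K(x): zlib-compressed length in bits (an upper bound only)."""
    return 8.0 * len(zlib.compress(x, 9))


def critic_estimate(x: bytes) -> float:
    """Rough estimate of U(x) = -log P(x) - K(x), measured in bits."""
    return neg_log2_p_uniform(x) - complexity_proxy_bits(x)


rng = random.Random(0)  # fixed seed so the example is reproducible
# Looks like a typical sample from P: incompressible noise.
typical = bytes(rng.getrandbits(8) for _ in range(4096))
# All zeros: exactly as probable under P as any other input, but highly compressible.
constant = bytes(4096)

u_typical = critic_estimate(typical)
u_constant = critic_estimate(constant)
print(f"typical sample:  {u_typical:.0f} bits")
print(f"constant sample: {u_constant:.0f} bits")
```

Under these assumptions the constant input receives a far larger score than the noise input, even though $P$ assigns both the same probability. This illustrates why $-\log P(x)$ alone cannot separate the two, while subtracting a complexity term can.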
Abstract
The last decade has seen tremendous progress in our ability to generate realistic-looking data, be it images, text, audio, or video. Here, we discuss the closely related problem of quantifying realism, that is, designing functions that can reliably tell realistic data from unrealistic data. This problem turns out to be significantly harder to solve and remains poorly understood, despite its prevalence in machine learning and recent breakthroughs in generative AI. Drawing on insights from algorithmic information theory, we discuss why this problem is challenging, why a good generative model alone is insufficient to solve it, and what a good solution would look like. In particular, we introduce the notion of a universal critic, which unlike adversarial critics does not require adversarial training. While universal critics are not immediately practical, they can serve both as a North Star for guiding practical implementations and as a tool for analyzing existing attempts to capture realism.
