Exploring the Naturalness of AI-Generated Images

Zijian Chen, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhongpeng Ji, Fengyu Sun, Shangling Jui, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang

TL;DR

This work tackles the naturalness of AI-generated images (AGIs) by introducing the AGIN database, which collects human ratings of technical quality, rationality, and overall naturalness across five generative tasks and 18 models. The study shows that both low-level technical distortions and high-level rationality factors shape perceived naturalness, and that a simple linear combination of per-perspective scores aligns well with human judgments. Building on this insight, the authors propose JOINT, a two-branch, brain-inspired framework that jointly learns technical priors and rationality cues; JOINT++ adds finer supervision by incorporating per-perspective labels. JOINT and JOINT++ align more closely with human ratings on AGIN than a wide range of image quality assessment (IQA) and image aesthetic assessment (IAA) baselines, and ablations confirm the value of multi-perspective supervision and artifact-aware training. Overall, AGIN provides a foundation for robust image naturalness assessment (INA) of AI-generated media and enables downstream supervision for improving the naturalness of generated images.
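
To make the "simple linear combination of per-perspective scores" concrete, here is a minimal sketch of what such a fit could look like: an ordinary least-squares fit of overall naturalness from technical and rationality scores, followed by a Pearson correlation check. The function names and the toy scores are illustrative assumptions. They are not AGIN data, and this is not necessarily the fitting procedure the authors used.

```python
from statistics import mean


def fit_linear_combination(technical, rationality, naturalness):
    """Least-squares fit of naturalness ~ a*technical + b*rationality + c.

    Hypothetical helper for illustration; solves the 2x2 normal equations
    on mean-centered scores, then recovers the intercept.
    """
    mt, mr, mn = mean(technical), mean(rationality), mean(naturalness)
    t = [x - mt for x in technical]
    r = [x - mr for x in rationality]
    n = [x - mn for x in naturalness]

    # Sums of squares and cross-products of the centered scores.
    stt = sum(x * x for x in t)
    srr = sum(x * x for x in r)
    s_tr = sum(x * y for x, y in zip(t, r))
    stn = sum(x * y for x, y in zip(t, n))
    srn = sum(x * y for x, y in zip(r, n))

    det = stt * srr - s_tr ** 2
    if det == 0:
        raise ValueError("technical and rationality scores are collinear")

    a = (stn * srr - srn * s_tr) / det
    b = (srn * stt - stn * s_tr) / det
    c = mn - a * mt - b * mr
    return a, b, c


def pearson(x, y):
    """Pearson linear correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    var_x = sum((p - mx) ** 2 for p in x)
    var_y = sum((q - my) ** 2 for q in y)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Invented toy ratings on a 1-5 scale (NOT taken from AGIN).
    technical = [1, 2, 3, 4, 5]
    rationality = [2, 1, 4, 3, 5]
    naturalness = [1.6, 1.4, 3.6, 3.4, 5.0]

    a, b, c = fit_linear_combination(technical, rationality, naturalness)
    predicted = [a * t + b * r + c for t, r in zip(technical, rationality)]
    print("weights:", a, b, "intercept:", c)
    print("correlation with ratings:", pearson(predicted, naturalness))
```

On real subjective data the fitted weights indicate how strongly each perspective contributes to overall naturalness, and the correlation between the combined score and the human ratings measures how well the linear model aligns with them.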

Abstract

The proliferation of Artificial Intelligence-Generated Images (AGIs) has greatly expanded the Image Naturalness Assessment (INA) problem. Unlike early definitions that mainly focus on tone-mapped images with limited distortions (e.g., exposure, contrast, and color reproduction), INA on AI-generated images is especially challenging because it covers more diverse content and can be affected by factors from multiple perspectives, including low-level technical distortions and high-level rationality distortions. In this paper, we take the first step to benchmark and assess the visual naturalness of AI-generated images. First, we construct the AI-Generated Image Naturalness (AGIN) database by conducting a large-scale subjective study to collect human opinions on the overall naturalness as well as perceptions from technical and rationality perspectives. AGIN verifies that naturalness is universally and disparately affected by technical and rationality distortions. Second, we propose the Joint Objective Image Naturalness evaluaTor (JOINT) to automatically predict the naturalness of AGIs in alignment with human ratings. Specifically, JOINT imitates human reasoning in naturalness evaluation by jointly learning both technical and rationality features. We demonstrate that JOINT significantly outperforms baselines, providing more subjectively consistent results on naturalness assessment.

Paper Structure (29 sections, 11 equations, 19 figures, 8 tables)

Figures (19)

  • Figure 1: The proposed AGIN, a first-of-its-kind image naturalness assessment database with human opinions from technical, rationality, and overall naturalness perspectives, focusing on 5 generative tasks (i.e., text-to-image, image translation, image inpainting, image colorization, and image editing).
  • Figure 2: The motivation of naturalness assessment for AI-generated images: multi-perspective settings can effectively avoid the perceptual bias on single absolute evaluation, and provide more accurate judgments to serve as downstream supervision.
  • Figure 3: Workflow of the human evaluation in AGIN: source images are first collected from 5 generative tasks and real-world image datasets (a), and then we conduct in-lab training with instructions (b). After that, subjects are asked to rate the images from three aspects (c), while carrying out tests (d) to control the annotation quality.
  • Figure 4: Data properties of AGIN. (a) The correlations between technical and rationality perspectives, (b) distributions of overall naturalness scores, and (c) the tendency of main factors chosen by participants across different ranges of naturalness scores.
  • Figure 5: More comparisons of image generation models in terms of naturalness-related factors. Zoom in for better visualization.
  • ...and 14 more figures