Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images

Giuseppe Cartella, Vittorio Cuculo, Marcella Cornia, Rita Cucchiara

TL;DR

This work collects a novel dataset of partially manipulated images using diffusion models and conducts an eye-tracking experiment to record the eye movements of different observers while viewing real and fake stimuli to explore the distinctive patterns in how humans perceive genuine and altered images.

Abstract

Creating high-quality and realistic images is now possible thanks to the impressive advancements in image generation. A natural-language description of the desired output is all that is needed to obtain breathtaking results. However, as the use of generative models grows, so do concerns about the propagation of malicious content and misinformation. Consequently, the research community is actively developing novel fake detection techniques, primarily focusing on low-level features and possible fingerprints left by generative models during the image generation process. Taking a different direction, we investigate whether human semantic knowledge can be incorporated into fake image detection frameworks. To this end, we collect a novel dataset of partially manipulated images using diffusion models and conduct an eye-tracking experiment to record the eye movements of different observers while they view real and fake stimuli. A preliminary statistical analysis explores the distinctive patterns in how humans perceive genuine and altered images. The findings reveal that, when viewing counterfeit samples, humans tend to focus on more confined regions of the image, in contrast to the more dispersed viewing pattern observed for genuine images. Our dataset is publicly available at: https://github.com/aimagelab/unveiling-the-truth.

Paper Structure

This paper contains 7 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Overview of the human gaze patterns when observing real and altered images. Interestingly, humans tend to focus on more circumscribed areas when looking at counterfeit samples. Light-blue masks of edited images represent inpainted regions.
  • Figure 2: Qualitative visualizations of the proposed approach. (a) Image editing examples where the white masks represent the inpainting regions. (b) Histogram of the ratings of realism given by the users in the eye-tracking experiment. (c) Kernel density estimation of the saliency maps' entropy across viewers.
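
The entropy of a saliency map, mentioned in Figure 2(c), is one way to quantify how confined or dispersed a viewer's attention is: a lower entropy means the gaze is concentrated in a smaller region. The following is a minimal sketch of that idea, not the authors' implementation. The grid size, the Gaussian width `sigma`, the helper names `saliency_map` and `shannon_entropy`, and the fixation coordinates are all assumptions made for illustration; the summary above does not specify how the saliency maps were built.

```python
import math


def saliency_map(fixations, width, height, sigma=1.5):
    """Sum an isotropic Gaussian at each fixation, then normalize to sum to 1.

    `fixations` is a list of (x, y) grid coordinates. The Gaussian smoothing
    and the value of `sigma` are illustrative assumptions.
    """
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                grid[y][x] += math.exp(-d2 / (2.0 * sigma ** 2))
    total = sum(sum(row) for row in grid)
    if total == 0.0:
        raise ValueError("at least one fixation is required")
    return [[v / total for v in row] for row in grid]


def shannon_entropy(prob_map):
    """Shannon entropy in bits of a map whose cells sum to 1."""
    return -sum(p * math.log2(p) for row in prob_map for p in row if p > 0.0)


# Hypothetical fixations on a 20x20 grid: one tight cluster, one spread out.
focused = [(10, 10), (11, 10), (10, 11), (11, 11)]
dispersed = [(2, 2), (17, 3), (4, 16), (16, 17)]

focused_map = saliency_map(focused, 20, 20)
dispersed_map = saliency_map(dispersed, 20, 20)
h_focused = shannon_entropy(focused_map)
h_dispersed = shannon_entropy(dispersed_map)

print(f"entropy (focused):   {h_focused:.2f} bits")
print(f"entropy (dispersed): {h_dispersed:.2f} bits")
```

In this toy setup the clustered fixations yield a lower entropy than the scattered ones, which mirrors the direction of the effect reported for edited versus genuine images. Comparing the distribution of such entropy values across viewers, for example with a kernel density estimate, would be a natural next step.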