
On the RAID dataset of perceptual responses: analysis and statistical causes

Paula Daudén-Oliver, David Agost-Beltran, Emilio Sansano-Sansano, Raul Montoliu, Valero Laparra, Jesús Malo, Marina Martínez-Garcia

Abstract

This work analyzes the RAID dataset to evaluate human responses to image distortions: the affine transformations of rotation, translation, and scaling, together with Gaussian noise. With distortion magnitude expressed as Mean Squared Error (MSE), the study establishes human detection thresholds for these distortions, which makes them comparable across types. Statistical analysis with ANOVA and Tukey-Kramer tests reveals that observers are significantly more sensitive to Gaussian noise, which consistently produced the lowest detection thresholds. Fourier analysis further shows that high-frequency components act as a visual mask for Gaussian noise: high-frequency energy correlates strongly with detection thresholds. Additionally, spectral orientation influences the perception of rotation. Finally, the study employs the PixelCNN model to show that image probability significantly correlates with detection thresholds for most distortions, suggesting that statistical likelihood affects human visual tolerance.
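To make the Fourier-analysis step concrete, here is a minimal sketch of how the proportion of high-frequency energy in an image could be measured with a circular mask in the Fourier domain. It is an illustration under stated assumptions, not the authors' code. The function name `high_frequency_energy_ratio` and the parameter `cutoff_fraction` are hypothetical, and the cutoff radius used in the paper is not given here.

```python
import numpy as np

def high_frequency_energy_ratio(image, cutoff_fraction=0.25):
    """Fraction of spectral energy lying outside a centered circular region.

    cutoff_fraction is a hypothetical parameter: the circle radius expressed
    as a fraction of the smaller image dimension (an assumption, not a value
    taken from the paper).
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape

    # Centered 2-D power spectrum: after fftshift the DC component
    # sits at index (rows // 2, cols // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    power = np.abs(spectrum) ** 2

    # Distance of every frequency bin from the DC component.
    y, x = np.ogrid[:rows, :cols]
    radius = np.hypot(y - rows // 2, x - cols // 2)

    # Bins outside the circle count as "high frequency".
    high_mask = radius > cutoff_fraction * min(rows, cols)

    total = power.sum()
    return float(power[high_mask].sum() / total) if total > 0 else 0.0
```

Given one such ratio per reference image and the matching detection thresholds, a Pearson coefficient can then be obtained with `np.corrcoef(ratios, thresholds)[0, 1]`.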

Paper Structure

This paper contains 4 sections, 1 equation, 8 figures, 1 table.

Figures (8)

  • Figure 1: Human thresholds in MSE, computed in two ways: directly between the original and the distorted image (top), and as the sum of the MSE values between consecutive distortion levels, starting with the pair formed by the reference image and the first distorted image (bottom).
  • Figure 2: Box plots of the thresholds in MSE, computed in two ways: directly between the original and the distorted image (left), and as the sum of the MSE values between consecutive distortion levels, starting with the pair formed by the reference image and the first distorted image (right).
  • Figure 3: Correlation of human thresholds between different distortions. The red line shows the fitted slope, and the black line represents unity.
  • Figure 4: Circular filter applied to the images. From left to right: (1) example of a reference image; (2) reference image in the Fourier domain; (3) filter applied to the image in the Fourier domain; (4) filtered image.
  • Figure 5: Pearson correlation between the proportion of high-frequency energy and the responses to Gaussian noise, for each level of distortion and for the threshold derived from the responses.
  • ...and 3 more figures
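
The two MSE measures named in the captions of Figures 1 and 2 can be sketched as follows. This is a minimal illustration, assuming each distortion is available as an ordered list of images of increasing severity. The helper names `mse`, `direct_mse`, and `cumulative_mse` are hypothetical and do not come from the paper.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images of equal shape."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

def direct_mse(reference, distorted_levels):
    """MSE between the reference and each distortion level."""
    return [mse(reference, d) for d in distorted_levels]

def cumulative_mse(reference, distorted_levels):
    """Running sum of MSE between consecutive levels.

    The first term compares the reference with the first distorted image;
    each later term compares a level with the one before it.
    """
    sequence = [reference] + list(distorted_levels)
    steps = [mse(prev, cur) for prev, cur in zip(sequence[:-1], sequence[1:])]
    return np.cumsum(steps).tolist()
```

The two measures agree at the first level and can diverge afterwards. For distortions such as translation, the pixel-wise difference from the reference need not grow steadily with the distortion level, whereas the cumulative sum never decreases.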