Do you see what I see? An Ambiguous Optical Illusion Dataset exposing limitations of Explainable AI
Carina Newen, Luca Hinkamp, Maria Ntonti, Emmanuel Müller
TL;DR
This paper identifies a fundamental gap in explainable AI for visual data: pixel-level attributions often fail to resolve the perceptual ambiguity inherent in optical illusions. It proposes gaze direction and eye position as generalizable concepts to guide learning and explanations, and introduces Ambivision, an open-source dataset of two-animal optical illusions with bounding boxes and explicit gaze/eye annotations. Across multiple architectures, experiments show that incorporating these concept-level cues improves classification accuracy on ambiguous images and reveals limitations of standard XAI methods such as Grad-CAM, Integrated Gradients, and PipNet. The work highlights bias-mitigation strategies in synthetic data generation and advocates a shift toward concept-based explanations, with potential impact on safety-critical vision tasks.
Abstract
From uncertainty quantification to real-world object detection, we recognize the importance of machine learning algorithms, particularly in safety-critical domains such as autonomous driving or medical diagnostics. Ambiguous data plays an important role in many machine learning domains. Optical illusions present a compelling area of study in this context, as they offer insight into the limitations of both human and machine perception. Despite this relevance, optical illusion datasets remain scarce. In this work, we introduce a novel dataset of optical illusions featuring intermingled animal pairs designed to evoke perceptual ambiguity. We identify generalizable visual concepts, particularly gaze direction and eye cues, as subtle yet impactful features that significantly influence model accuracy. By confronting models with perceptual ambiguity, our findings underscore the importance of concepts in visual learning and provide a foundation for studying bias and alignment between human and machine vision. To make this dataset useful for general purposes, we generate optical illusions systematically, varying the concepts discussed in our bias mitigation section. The dataset is available on Kaggle at https://kaggle.com/datasets/693bf7c6dd2cb45c8a863f9177350c8f9849a9508e9d50526e2ffcc5559a8333. Our source code can be found at https://github.com/KDD-OpenSource/Ambivision.git.
