ColorSense: A Study on Color Vision in Machine Visual Recognition

Ming-Chang Chiu; Yingfei Wang; Derrick Eui Gyu Kim; Pin-Yu Chen; Xuezhe Ma

TL;DR

ColorSense introduces a color-vision-centric benchmarking framework and dataset suite (ColorSense-ImageNet and ColorSense-CIFAR10) to study how color discrimination affects machine vision. By labeling dominant foreground and background colors and grouping color pairs into Hard, Medium, and Easy categories, the authors systematically evaluate a wide range of architectures and model sizes on ImageNet and related tasks, revealing robust color-vision biases that persist across models and training regimes. They also propose sCVR and sCR robustness metrics and demonstrate their utility in assessing performance under color and corruption variations, including high-stakes scenarios like vehicle recognition. The findings highlight a substantive gap between human and machine color perception, the limited effectiveness of current mitigation strategies, and the need for new evaluation frameworks to ensure robust, safe deployment of vision systems in the real world.

Abstract

Color vision is essential for human visual perception, but its impact on machine perception remains underexplored. Demand for understanding its role in safety-critical tasks such as assistive driving and surgery has intensified, yet suitable datasets are lacking. To fill this gap, we curate ColorSense, a collection of multipurpose datasets, by gathering 110,000 non-trivial human annotations of foreground and background color labels from popular visual recognition benchmarks. To investigate the impact of color vision on machine perception, we assign each image a color discrimination level based on its dominant foreground and background colors. We validate the use of our datasets by demonstrating that the level of color discrimination has a dominating effect on the performance of mainstream machine perception models. Specifically, we examine the perception ability of machine vision by considering key factors such as model architecture, training objective, model size, training data, and task complexity. Furthermore, to investigate how color and environmental factors affect the robustness of visual recognition, we integrate our ColorSense datasets with image corruptions and perform a more comprehensive visual perception evaluation. Our findings suggest that object recognition tasks such as classification and localization are susceptible to color vision bias, especially in high-stakes cases such as vehicle classes, and that advanced mitigation techniques such as data augmentation yield only marginal improvement. Our analyses highlight the need for new approaches to evaluating the performance of machine perception models in real-world applications. Lastly, we present several potential applications of ColorSense, such as studying spurious correlations.
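
As an illustration of the evaluation idea described above, the following minimal sketch computes classification accuracy per color discrimination group from per-image foreground and background color labels. It is not the authors' code: the function name `accuracy_by_group`, the record layout, the treatment of a color pair as unordered, the `"Unassigned"` fallback, and the toy pair-to-group table are all assumptions made for illustration. The paper's actual assignment of color pairs to Hard, Medium, and Easy is not reproduced here.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Tuple

# One evaluated image: (foreground color, background color, was the prediction correct?)
Record = Tuple[str, str, bool]


def accuracy_by_group(
    records: Iterable[Record],
    pair_to_group: Dict[FrozenSet[str], str],
    default: str = "Unassigned",
) -> Dict[str, float]:
    """Return accuracy per color discrimination group.

    pair_to_group maps an unordered {foreground, background} color pair to a
    group name. Treating the pair as unordered is an assumption of this sketch.
    """
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for fg, bg, correct in records:
        group = pair_to_group.get(frozenset((fg, bg)), default)
        totals[group] += 1
        hits[group] += int(correct)
    return {group: hits[group] / totals[group] for group in totals}


# Placeholder table for demonstration only; NOT the paper's grouping.
toy_groups = {
    frozenset(("red", "red")): "Hard",
    frozenset(("red", "blue")): "Easy",
}

toy_records = [
    ("red", "red", False),
    ("red", "red", True),
    ("blue", "red", True),
    ("green", "white", True),  # pair missing from the table -> "Unassigned"
]

result = accuracy_by_group(toy_records, toy_groups)
print(result)
```

Reporting accuracy separately per group, rather than as one aggregate number, is what makes a gap between groups visible.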

Paper Structure (49 sections, 16 figures, 7 tables)

Introduction
Bridging Human Vision and Machine Vision
Human Vision Test
Machine Vision Test
Benchmarking for Color Vision
Labeling Process
Defining Color Discrimination Groups
Evaluation: Image Classification
Fundamental Questions
Does color vision affect DNN models as it affects humans?
Does architecture matter?
Does model size matter?
Main Findings
DNNs are deeply affected by color vision
Model size and architecture do not add obvious robustness to color vision
...and 34 more sections

Figures (16)

Figure 1: Mapping of our ColorSense datasets to human visual aspects and examples of ColorSense integrated with corrupted images (ImageNet and CIFAR10). (a) ColorSense can evaluate color vision as well as more complete visual perception capabilities. (b-c) Examples of ColorSense integrated with corrupted datasets. Each row indicates a color discrimination level.
Figure 2: Examples of our ColorSense datasets and labeling principles. (a) The top two rows are ColorSense-ImageNet-Foreground, and the bottom two rows are ColorSense-ImageNet-Background. In conjunction, we can define color discrimination groups (see the datasets section). (b) ColorSense-CIFAR10. (c) Examples of the three labeling principles. See the Labeling Process section for full labeling details.
Figure 3: Testing models with different architectures and model sizes (ImageNet). We observe similar trends in the three CD groups (Hard, Medium, Easy) across all models, showing the effect of color vision is universal across architectures and model sizes.
Figure 4: Color vision performance of standard and advanced procedure-trained models. Advanced training makes models only slightly more robust to color discrimination (see also the corresponding table in the paper).
Figure 5: Color vision is in effect no matter how models are pre-trained. Left: ViT accuracies (%) pre-trained on ImageNet-21k. Right: CLIP zero-shot performance (%) with different training data sizes.
...and 11 more figures
