Exploring Color Invariance through Image-Level Ensemble Learning
Yunpeng Gong, Jiaquan Li, Lifei Chen, Min Jiang
TL;DR
This work tackles color bias caused by lighting and camera variation in wide-area surveillance tasks such as person re-identification and industrial dust segmentation. It introduces Random Color Erasing (RCE), an image-level ensemble learning strategy that combines global grayscale transformations with local grayscale patches to rebalance color features and non-color discriminative cues; the method uses probabilities $p_g$ and $p_r$ to control global and local erasing and can be viewed as an efficient alternative to GAN-based style transfer. The authors provide theoretical insights suggesting that ensembling grayscale- and color-trained components can reduce generalization error, and they validate the approach through extensive experiments across two tasks, multiple datasets, and several strong baselines, with notable improvements in cross-domain settings. Visual analyses with Grad-CAM support the claim that RCE yields more robust attention to truly discriminative regions under color variations, underscoring practical impact for robust color-invariant vision systems.
Abstract
Color bias, caused by fluctuations in real-world lighting and camera conditions, is a persistent challenge to model robustness in computer vision. The issue is particularly pronounced in complex wide-area surveillance scenarios such as person re-identification and industrial dust segmentation, where models overfit to color information during training and then lose performance when environmental conditions vary. Models therefore need to adapt effectively to varying camera conditions. To address this challenge, this study introduces a learning strategy named Random Color Erasing, which draws inspiration from ensemble learning. The strategy selectively erases partial or complete color information in the training data without disrupting the original image structure, thereby balancing the weighting of color features and other features within the neural network. This mitigates the risk of overfitting and improves the model's ability to handle color variation, and hence its overall robustness. The proposed approach serves as an ensemble learning strategy with strong interpretability, and this paper presents a comprehensive analysis of the method. Across tasks such as person re-identification and semantic segmentation, our approach consistently improves strong baseline methods. Notably, compared with existing methods that prioritize color robustness, our strategy significantly enhances performance in cross-domain scenarios. The code is available at \url{https://github.com/layumi/Person\_reID\_baseline\_pytorch/blob/master/random\_erasing.py} or \url{https://github.com/finger-monkey/Data-Augmentation}.
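The erasing step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `to_gray3` and `random_color_erasing`, the default values of `p_g` and `p_r`, the patch-size parameters `min_frac` and `max_frac`, the aspect-ratio range, the BT.601 grayscale weights, and the choice to apply local erasing only when global erasing is not triggered are all assumptions made for illustration.

```python
import numpy as np

# ITU-R BT.601 luma weights, a common RGB-to-grayscale convention (assumed here).
_LUMA = np.array([0.299, 0.587, 0.114])


def to_gray3(img):
    """Return a 3-channel grayscale copy of an H x W x 3 RGB array."""
    gray = img.astype(np.float64) @ _LUMA
    if np.issubdtype(img.dtype, np.integer):
        gray = np.rint(gray)  # round before casting back to an integer dtype
    return np.repeat(gray[..., None], 3, axis=2).astype(img.dtype)


def random_color_erasing(img, p_g=0.05, p_r=0.3,
                         min_frac=0.02, max_frac=0.4, rng=None):
    """Hypothetical sketch: erase color globally with probability p_g,
    otherwise erase color in one random rectangle with probability p_r.
    Pixel positions and image shape are never changed."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[:2]

    # Global erasing: the whole image becomes grayscale.
    if rng.random() < p_g:
        return to_gray3(img)

    out = img.copy()
    # Local erasing: one rectangular patch becomes grayscale.
    if rng.random() < p_r:
        area = rng.uniform(min_frac, max_frac) * h * w
        aspect = rng.uniform(0.3, 1.0 / 0.3)  # assumed aspect-ratio range
        ph = min(h, max(1, int(round(np.sqrt(area * aspect)))))
        pw = min(w, max(1, int(round(np.sqrt(area / aspect)))))
        top = int(rng.integers(0, h - ph + 1))
        left = int(rng.integers(0, w - pw + 1))
        patch = img[top:top + ph, left:left + pw]
        out[top:top + ph, left:left + pw] = to_gray3(patch)
    return out
```

In a training pipeline, a transform of this kind would be applied per sample, so that over many iterations the network sees a mixture of color, partly grayscale, and fully grayscale views of the same data, which is the image-level ensemble effect the abstract refers to.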
