Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond

Krishna Kumar Singh; Hao Yu; Aron Sarmasi; Gautam Pradeep; Yong Jae Lee

TL;DR

Hide-and-Seek (HaS) is a patch-based occlusion data augmentation: it hides random patches of each training image so that the network cannot rely on a single most-discriminative region and must use multiple object parts, which improves weakly-supervised localization and robustness to occlusion. Hidden pixels are filled with the dataset mean $μ$ so that input statistics seen during training match those seen at test time, when nothing is hidden. The approach extends to videos via temporal patch hiding. Experiments on object localization, semantic segmentation, temporal action localization, and supervised tasks show consistent gains across architectures and datasets. The work also gives practical guidance on patch sizes and hiding probabilities, and releases code and models on its project page.

Abstract

We propose 'Hide-and-Seek', a general-purpose data augmentation technique that is complementary to existing data augmentation techniques and is beneficial for various visual recognition tasks. The key idea is to hide patches in a training image randomly, forcing the network to seek other relevant content when the most discriminative content is hidden. Our approach only needs to modify the input image and can work with any network to improve its performance. During testing, it does not need to hide any patches. The main advantage of Hide-and-Seek over existing data augmentation techniques is its ability to improve object localization accuracy in the weakly-supervised setting, and we therefore use this task to motivate the approach. However, Hide-and-Seek is not tied to the image localization task alone: it generalizes to other forms of visual input, such as videos, and to other recognition tasks, including image classification, temporal action localization, semantic segmentation, emotion recognition, age/gender estimation, and person re-identification. We perform extensive experiments to showcase the advantage of Hide-and-Seek on these visual recognition problems.
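
The hiding step described above can be illustrated with a minimal sketch. It is not the authors' released implementation: the regular grid of square patches, the single-channel nested-list image, and the names `hide_patches`, `augment`, `patch_size`, `hide_prob`, and `fill_value` are all assumptions made for illustration. The text only states that patches are hidden at random during training, that hidden pixels take the dataset mean, and that nothing is hidden at test time.

```python
import random


def hide_patches(image, patch_size, hide_prob, fill_value, rng=None):
    """Return a copy of `image` with randomly chosen patches overwritten.

    image      -- 2D list of pixel values (one channel, for simplicity)
    patch_size -- side length of each square patch (assumed grid layout)
    hide_prob  -- probability of hiding each patch independently
    fill_value -- value written into hidden pixels, e.g. the dataset mean
    rng        -- optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    height = len(image)
    width = len(image[0]) if height else 0
    out = [row[:] for row in image]  # leave the caller's image untouched

    # Walk the image in patch-sized steps and decide per patch whether to hide it.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            if rng.random() < hide_prob:
                # Clip at the borders so partial patches are handled too.
                for y in range(top, min(top + patch_size, height)):
                    for x in range(left, min(left + patch_size, width)):
                        out[y][x] = fill_value
    return out


def augment(image, training, **hide_kwargs):
    """Hide patches only while training; return the image unchanged at test time."""
    return hide_patches(image, **hide_kwargs) if training else image
```

A call such as `augment(img, training=True, patch_size=4, hide_prob=0.5, fill_value=mean)` would be applied to each training image, where `mean` stands for a precomputed dataset mean; the specific values shown here are placeholders, not settings taken from the paper.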
