GridMask Data Augmentation

Pengguang Chen; Shu Liu; Hengshuang Zhao; Xingquan Wang; Jiaya Jia

TL;DR

GridMask is a simple, structured information-dropping augmentation that masks a regular grid of squares in the input image, balancing information removal against information preservation. The method is governed by four parameters, r, d, delta_x, and delta_y, and it consistently improves performance on ImageNet, COCO, and Cityscapes, often outperforming more complex policies such as AutoAugment at far lower computational cost. Ablation studies support the chosen hyperparameters and show that structured dropping matters more than random occlusion. The technique generalizes well across tasks and can serve as a baseline policy for future augmentation searches. Overall, GridMask is a practical, scalable augmentation strategy with broad applicability in computer vision.
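To make the role of the four parameters concrete, here is a minimal NumPy sketch of how such a grid mask could be built. It is not the authors' implementation. It assumes that d is the side length of one grid unit in pixels, that r is the fraction of each unit's side left intact (so each dropped square has side d - round(r * d)), and that delta_x and delta_y shift the grid from the image's top-left corner. The function name `gridmask` and its signature are hypothetical.

```python
import numpy as np

def gridmask(height, width, d, r, delta_x=0, delta_y=0):
    """Return a (height, width) binary mask: 1 keeps a pixel, 0 drops it.

    Assumed semantics (not confirmed by the text above):
      d        side length of one grid unit, in pixels
      r        fraction of each unit's side that stays intact
      delta_x  horizontal offset of the grid, in pixels
      delta_y  vertical offset of the grid, in pixels
    """
    keep = int(round(r * d))  # width of the intact band inside each unit

    # Position of every row and column inside its grid unit, after the offset.
    ys = (np.arange(height) - delta_y) % d
    xs = (np.arange(width) - delta_x) % d

    # A pixel is dropped only where both its row and its column fall
    # outside the intact band, which yields a regular grid of squares.
    drop = (ys >= keep)[:, None] & (xs >= keep)[None, :]
    return (~drop).astype(np.uint8)

# Usage: apply the mask to an H x W x C image by broadcasting over channels.
image = np.random.rand(8, 8, 3)
mask = gridmask(8, 8, d=4, r=0.5)
augmented = image * mask[:, :, None]
```

Under these assumptions the fraction of pixels kept is roughly 1 - (1 - r)^2, so r directly controls the trade-off between removing and preserving information, while d sets the granularity of the dropped squares.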

Abstract

We propose a novel data augmentation method, GridMask, in this paper. It uses information removal to achieve state-of-the-art results in a variety of computer vision tasks. We analyze the requirements of information dropping, show the limitations of existing information-dropping algorithms, and propose our structured method, which is simple yet very effective. It is based on the deletion of regions of the input image. Our extensive experiments show that our method outperforms the latest AutoAugment, which is far more computationally expensive because it uses reinforcement learning to find the best policies. On ImageNet for recognition, COCO2017 for object detection, and Cityscapes for semantic segmentation, our method notably improves performance over the baselines in every case. These experiments demonstrate the effectiveness and generality of the new method.
