MixCut: A Data Augmentation Method for Facial Expression Recognition
Jiaxiang Yu, Yiyang Liu, Ruiyang Fan, Guobing Sun
TL;DR
MixCut tackles data-scarce facial expression recognition by combining pixel-level interpolation with random square masking to generate augmented samples, defined by $x' = M \odot [\lambda x_A + (1-\lambda) x_B]$ and $y' = \lambda y_A + (1-\lambda) y_B$, with $\lambda \sim \text{Beta}(1,1)$ and a masking ratio governed by $\beta = 1 - \eta$, $\eta \sim \text{Beta}(1,1)$. The method uses a binary mask $M$ to remove random square regions and a hyperparameter $\gamma$, reported as 0.5, to control the probability of applying the augmentation. Evaluations on Fer2013Plus and RAF-DB show that MixCut consistently improves over the baseline and other augmentations (Cutout, Mixup, CutMix), achieving 85.63% and 87.88% accuracy respectively, with hyperparameter studies indicating robust performance when $\lambda$, $\beta$, and $\gamma$ are varied around moderate, stochastic values. The findings suggest MixCut is a practical, easily implementable augmentation with potential applicability to broader vision tasks and learning frameworks.
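The two formulas above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation. The function name `mixcut`, its signature, and the image layout (H, W, C) are hypothetical. The summary does not say how the masking ratio maps to the size of the square, so the sketch assumes it is the fraction of the image area that is removed, and it assumes the square is placed uniformly at random so that it fits entirely inside the image. Note that Beta(1,1) is the uniform distribution on [0, 1].

```python
import numpy as np


def mixcut(x_a, y_a, x_b, y_b, rng, gamma=0.5):
    """Hypothetical MixCut sketch for one pair of samples.

    x_a, x_b: float images of shape (H, W, C).
    y_a, y_b: one-hot (or soft) label vectors.
    gamma:    probability of applying the augmentation.
    """
    # With probability 1 - gamma, return sample A unchanged.
    if rng.random() >= gamma:
        return x_a.copy(), y_a.copy()

    lam = rng.beta(1.0, 1.0)   # interpolation ratio, lambda ~ Beta(1, 1)
    eta = rng.beta(1.0, 1.0)   # eta ~ Beta(1, 1)
    beta = 1.0 - eta           # masking ratio, beta = 1 - eta

    # Pixel-level interpolation of images and labels.
    mixed = lam * x_a + (1.0 - lam) * x_b
    label = lam * y_a + (1.0 - lam) * y_b

    h, w = mixed.shape[:2]
    # ASSUMPTION: beta is the fraction of image area to remove,
    # so the square side scales with sqrt(beta).
    side = int(round(min(h, w) * np.sqrt(beta)))
    # ASSUMPTION: the square lies fully inside the image.
    top = rng.integers(0, h - side + 1)
    left = rng.integers(0, w - side + 1)

    # Binary mask M: 0 inside the square, 1 elsewhere.
    mask = np.ones((h, w), dtype=mixed.dtype)
    mask[top:top + side, left:left + side] = 0.0

    # x' = M * [lam * x_A + (1 - lam) * x_B]; the label is not changed by M.
    return mixed * mask[..., None], label


# Usage with two random 48x48 grayscale images and 8 classes.
rng = np.random.default_rng(0)
img_a, img_b = rng.random((48, 48, 1)), rng.random((48, 48, 1))
lab_a, lab_b = np.eye(8)[2], np.eye(8)[5]
aug_img, aug_lab = mixcut(img_a, lab_a, img_b, lab_b, rng)
```

In this sketch the mask affects only the image, which matches the label formula above: $y'$ depends on $\lambda$ alone. A batched version would typically pair each sample with a shuffled copy of the batch, but that pairing strategy is also an assumption rather than something the summary specifies.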
Abstract
In facial expression recognition, classification accuracy is often low because only a small number of training samples is available. To address this problem, we propose a new data augmentation method named MixCut. In this method, we first interpolate two original training samples at the pixel level with a random ratio to generate a new sample. Pixels are then removed from a random square region of the new sample to produce the final training sample. We evaluated MixCut on Fer2013Plus and RAF-DB. With MixCut, we achieved 85.63% accuracy in eight-label classification on Fer2013Plus and 87.88% accuracy in seven-label classification on RAF-DB, effectively improving the classification accuracy of facial expression recognition. On Fer2013Plus, MixCut improved accuracy by +0.59%, +0.36%, and +0.39% over three other data augmentation methods: Cutout, Mixup, and CutMix, respectively. On RAF-DB, MixCut improved accuracy by +0.22%, +0.65%, and +0.5% over the same three methods, respectively.
