OpenMixup: Open Mixup Toolbox and Benchmark for Visual Representation Learning

Siyuan Li; Zedong Wang; Zicheng Liu; Juanxi Tian; Di Wu; Cheng Tan; Weiyang Jin; Stan Z. Li

TL;DR

This paper introduces OpenMixup, the first mixup augmentation codebase and benchmark for visual representation learning. It bundles a collection of popular vision backbones, optimization strategies, and analysis toolkits, which not only supports the benchmarking but also enables broader mixup applications beyond classification.

Abstract

Mixup augmentation has emerged as a widely used technique for improving the generalization ability of deep neural networks (DNNs). However, the lack of standardized implementations and benchmarks has impeded recent progress, resulting in poor reproducibility, unfair comparisons, and conflicting insights. In this paper, we introduce OpenMixup, the first mixup augmentation codebase and benchmark for visual representation learning. Specifically, we train 18 representative mixup baselines from scratch and rigorously evaluate them across 11 image datasets of varying scales and granularity, ranging from fine-grained scenarios to complex non-iconic scenes. We also open-source our modular codebase, including a collection of popular vision backbones, optimization strategies, and analysis toolkits, which not only supports the benchmarking but also enables broader mixup applications beyond classification, such as self-supervised learning and regression tasks. Through experiments and empirical analysis, we derive observations and insights on mixup performance-efficiency trade-offs, generalization, and optimization behaviors, and thereby identify preferred choices for different needs. To the best of our knowledge, OpenMixup has facilitated several recent studies. We believe this work can further advance reproducible mixup augmentation research and thereby lay solid groundwork for future progress in the community. The source code and user documents are available at https://github.com/Westlake-AI/openmixup.

Paper Structure (48 sections, 3 equations, 13 figures, 20 tables)

Introduction
Background and Related Work
Problem Definition
Mixup Training.
Mixup Reformulation.
Sample Mixing
Static Policies.
Dynamic Policies.
Label Mixing
Other Applications
OpenMixup
Benchmarked Methods
Benchmarking Tasks
Evaluation Metrics and Tools
Performance Metric.
...and 33 more sections

Figures (13)

Figure 1: Radar plot of top-1 accuracy for representative mixup baselines on 11 classification datasets.
Figure 2: Visualization of mixed samples from representative static and dynamic mixup augmentation methods on ImageNet-1K. We employ a mixing ratio of $\lambda=0.5$ for a comprehensive comparison. Note that dynamic mixing policies produce more precise mixed samples than static ones.
Figure 3: Overview of the OpenMixup codebase framework. (1) benchmarks provides benchmarking results and corresponding config files for mixup classification and transfer learning. (2) openmixup contains implementations of all supported methods. (3) configs is responsible for customizing setups of different mixup methods, networks, datasets, and training pipelines. (4) docs & tools contains paper lists of popular mixup methods, user documentation, and useful tools.
Figure 4: Trade-off evaluation with respect to accuracy performance, total training time (hours), and GPU memory (G). The results in (a) are based on DeiT-S architecture on ImageNet-1K. The results in (b) and (c) are based on DeiT-S and ConvNeXt-T backbones on CIFAR-100, respectively.
Figure 5: (a)(b) Training epoch vs. top-1 accuracy (%) plots of different mixup methods on CIFAR-100 to analyze training stability and convergence speed. (c) 1-D loss landscapes for mixup methods with ResNet-50 (300 epochs) on ImageNet-1K. The results show that dynamic approaches achieve deeper and wider loss landscapes than static ones, which may indicate better optimization behavior.
...and 8 more figures
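
To make the mixing ratio $\lambda$ from the Figure 2 caption concrete, the following is a minimal NumPy sketch of vanilla input mixup, the simplest static policy, in which two samples and their one-hot labels are linearly interpolated. It is an illustration under stated assumptions, not code from the OpenMixup repository: the function name `mixup_batch`, the tensor shapes, and the fixed $\lambda=0.5$ are all hypothetical choices made for this example.

```python
import numpy as np


def mixup_batch(x, y_onehot, lam, rng):
    """Hypothetical helper: mix each sample with a random partner from the batch.

    x        : array of shape (batch, ...), e.g. images
    y_onehot : array of shape (batch, num_classes)
    lam      : mixing ratio in [0, 1]
    rng      : numpy random Generator used to pick partners
    """
    # Pick a partner for every sample by shuffling the batch indices.
    perm = rng.permutation(x.shape[0])
    # Interpolate inputs and labels with the same ratio.
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mixed, y_mixed, perm


# Toy batch: 4 random "images" with 3 channels and 8x8 pixels, 10 classes.
rng = np.random.default_rng(0)
x = rng.random((4, 3, 8, 8))
y = np.eye(10)[[0, 1, 2, 3]]

# lam = 0.5 mirrors the ratio used in the Figure 2 caption.
x_mixed, y_mixed, perm = mixup_batch(x, y, lam=0.5, rng=rng)
```

In practice $\lambda$ is usually sampled per batch (commonly from a Beta distribution) rather than fixed, and dynamic policies replace the uniform interpolation with a learned or saliency-guided mixing mask.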
