Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning

Jingyang Li, Jiachun Pan, Vincent Y. F. Tan, Kim-Chuan Toh, Pan Zhou

TL;DR

This paper presents the first theoretical justification for the enhanced test accuracy observed in FixMatch-like SSL applied to DNNs, taking convolutional neural networks on classification tasks as an example. The analysis framework also applies to other FixMatch-like SSL methods, e.g., FlexMatch, FreeMatch, Dash, and SoftMatch.

Abstract

Semi-supervised learning (SSL), exemplified by FixMatch (Sohn et al., 2020), has shown significant generalization advantages over supervised learning (SL), particularly in the context of deep neural networks (DNNs). However, it is still unclear, from a theoretical standpoint, why FixMatch-like SSL algorithms generalize better than SL on DNNs. In this work, we present the first theoretical justification for the enhanced test accuracy observed in FixMatch-like SSL applied to DNNs by taking convolutional neural networks (CNNs) on classification tasks as an example. Our theoretical analysis reveals that the semantic feature learning processes in FixMatch and SL are rather different. In particular, FixMatch learns all the discriminative features of each semantic class, while SL only randomly captures a subset of features due to the well-known lottery ticket hypothesis. Furthermore, we show that our analysis framework can be applied to other FixMatch-like SSL methods, e.g., FlexMatch, FreeMatch, Dash, and SoftMatch. Inspired by our theoretical analysis, we develop an improved variant of FixMatch, termed Semantic-Aware FixMatch (SA-FixMatch). Experimental results corroborate our theoretical findings and the enhanced generalization capability of SA-FixMatch.
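
For readers unfamiliar with the method under analysis, the unlabeled-data objective of FixMatch (Sohn et al., 2020) can be sketched roughly as follows: a pseudo-label is taken from the prediction on a weakly augmented view, kept only if its confidence exceeds a threshold, and then used as the cross-entropy target for the strongly augmented view. This is a minimal pure-Python illustration, not the authors' implementation; the function names, the toy logits, and the default threshold of 0.95 are assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Confidence-masked cross-entropy on unlabeled data (illustrative sketch).

    weak_logits[i] / strong_logits[i] are the model outputs for the weakly
    and strongly augmented views of the same unlabeled sample.
    Returns (loss averaged over the whole batch, number of samples kept).
    """
    total = 0.0
    kept = 0
    for w, s in zip(weak_logits, strong_logits):
        probs_weak = softmax(w)
        confidence = max(probs_weak)
        if confidence < threshold:
            continue  # low-confidence pseudo-labels are masked out
        pseudo_label = probs_weak.index(confidence)
        probs_strong = softmax(s)
        total += -math.log(probs_strong[pseudo_label])
        kept += 1
    n = len(weak_logits)
    return (total / n if n else 0.0), kept


# Toy batch (hypothetical logits): only the first sample is confident enough.
weak = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
strong = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
loss, kept = fixmatch_unlabeled_loss(weak, strong)
print(loss, kept)
```

Averaging over the full batch rather than over the kept samples means that masked samples contribute zero loss, so the unsupervised signal grows as more pseudo-labels pass the threshold during training.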

Paper Structure

This paper contains 55 sections, 11 theorems, 97 equations, 5 figures, 10 tables, 1 algorithm.

Key Result

Theorem 4

Suppose Assumptions assum1, assum2 hold. For sufficiently large $k$ and $m = \mathrm{polylog}(k)$, setting $\eta \leq 1/\mathrm{poly}(k)$ and running FixMatch for $T = \mathrm{poly}(k)/\eta$ iterations ensures:

  • (a) Training performance is good. For all training samples $(X,y)\in\mathcal{Z}$, with pr ...
  • (b) Test performance is good. With probability at least $1-e^{-\Omega(\log^2 k)}$ over the selection ...

Figures (5)

  • Figure 1: Visualization of pretrained ResNet-50 (He et al., 2016) using Grad-CAM. ResNet-50 locates different regions for different car images, e.g., wheel, rearview mirror, front light, and door.
  • Figure 2: Visualization of WRN-28-8 via Grad-CAM on CIFAR-100. Each group of three images corresponds to models trained with SL (left), FixMatch (middle), and SA-FixMatch (right).
  • Figure 3: Visualization of the effects of CutOut (first row), Solarize (second row), and Equalize (third row) on CIFAR-100 images.
  • Figure 4: Samples from CIFAR-100, STL-10, Imagewoof, and ImageNet datasets. Samples in the first row are from CIFAR-100, samples in the second row are from STL-10, samples in the third row are from Imagewoof, and samples in the last row are from ImageNet.
  • Figure 5: The first single-view image contains only the front light feature, while the middle two multi-view images contain both wheel and front light features, and the last single-view image contains only the wheel feature.

Theorems & Definitions (37)

  • Definition 1: Informal, Data distribution (Allen-Zhu & Li, 2023)
  • Theorem 4
  • Theorem 5
  • Corollary 6
  • Definition 7: Data distributions for single-view data $\mathcal{D}_s$ and multi-view data $\mathcal{D}_m$ (Allen-Zhu & Li, 2023)
  • Definition 8
  • Theorem 11: Performance on FixMatch
  • Theorem 13: Performance on SA-FixMatch
  • Proposition 14
  • Definition 17
  • ...and 27 more