Interpolation Consistency Training for Semi-Supervised Learning

Vikas Verma; Kenji Kawaguchi; Alex Lamb; Juho Kannala; Arno Solin; Yoshua Bengio; David Lopez-Paz

TL;DR

ICT addresses semi-supervised learning by enforcing prediction consistency at interpolations between unlabeled samples, building on the mean-teacher framework and mixup-style interpolation. The method achieves state-of-the-art or competitive results on CIFAR-10, SVHN, and CIFAR-100 with computationally efficient training and modest hyperparameter tuning. The authors give a theoretical account showing that ICT acts as a regularizer on higher-order derivatives and that high-confidence predictions on unlabeled points help suppress overfitting to labeled points; empirical ablations support both claims. Overall, ICT offers a practical, scalable approach to semi-supervised learning with strong empirical gains and a principled derivative-regularization interpretation. Future work points toward interpolation in hidden representations.

Abstract

We introduce Interpolation Consistency Training (ICT), a simple and computationally efficient algorithm for training deep neural networks in the semi-supervised learning paradigm. ICT encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points. In classification problems, ICT moves the decision boundary to low-density regions of the data distribution. Our experiments show that ICT achieves state-of-the-art performance when applied to standard neural network architectures on the CIFAR-10 and SVHN benchmark datasets. Our theoretical analysis shows that ICT corresponds to a certain type of data-adaptive regularization with unlabeled points that reduces overfitting to labeled points under high confidence values.
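
The consistency idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`mix`, `ict_consistency_loss`, `ema_update`), the squared-error distance, the toy linear and softmax models, and the decay value are all assumptions chosen for illustration. The sketch compares the student's prediction at a mixed input with the same mixture of the teacher's predictions, and shows a mean-teacher-style moving-average weight update.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mix(a, b, lam):
    # Convex combination of two arrays with mixing coefficient lam in [0, 1].
    return lam * a + (1.0 - lam) * b

def ict_consistency_loss(student, teacher, u1, u2, lam):
    # Target: interpolation of the teacher's predictions at the two unlabeled batches.
    target = mix(teacher(u1), teacher(u2), lam)
    # Prediction: student's output at the interpolated input.
    pred = student(mix(u1, u2, lam))
    # Mean squared error is an assumed choice of distance for this sketch.
    return float(np.mean((pred - target) ** 2))

def ema_update(teacher_w, student_w, decay=0.999):
    # Mean-teacher-style exponential moving average; the decay value is illustrative.
    return decay * teacher_w + (1.0 - decay) * student_w

# Toy setup (hypothetical shapes): 8 unlabeled points, 4 features, 3 classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
linear_model = lambda x: x @ W
softmax_model = lambda x: softmax(x @ W)
u1 = rng.normal(size=(8, 4))
u2 = rng.normal(size=(8, 4))
lam = float(rng.beta(1.0, 1.0))  # mixing coefficient drawn from a Beta distribution

loss_linear = ict_consistency_loss(linear_model, linear_model, u1, u2, lam)
loss_softmax = ict_consistency_loss(softmax_model, softmax_model, u1, u2, lam)
```

A purely linear model satisfies the consistency condition up to floating-point error, so `loss_linear` is essentially zero, while the softmax model generally incurs a positive penalty. In a training loop, this term would be added to the supervised loss on labeled points, and the teacher weights would be refreshed with `ema_update` after each student step.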
