Unsupervised End-to-End Training with a Self-Defined Target

Dongshu Liu, Jérémie Laydevant, Adrien Pontlevy, Damien Querlioz, Julie Grollier

TL;DR

This work proposes a method enabling networks or hardware designed for end-to-end supervised learning to also perform high-performance unsupervised learning by adding two simple elements to the output layer: winner-take-all selectivity and homeostasis regularization.

Abstract

Designing algorithms for versatile AI hardware that can learn on the edge using both labeled and unlabeled data is challenging. Deep end-to-end training methods that combine phases of self-supervised and supervised learning are accurate and adaptable to input data, but self-supervised learning requires even more computational and memory resources than supervised learning, exceeding what current embedded hardware can provide. Conversely, unsupervised layer-by-layer training, such as Hebbian learning, is more compatible with existing hardware but does not integrate well with supervised learning. To address this, we propose a method enabling networks or hardware designed for end-to-end supervised learning to also perform high-performance unsupervised learning by adding two simple elements to the output layer: Winner-Take-All (WTA) selectivity and homeostasis regularization. These mechanisms introduce a "self-defined target" for unlabeled data, allowing purely unsupervised training of both fully-connected and convolutional layers using backpropagation or equilibrium propagation on datasets like MNIST (up to 99.2%), Fashion-MNIST (up to 90.3%), and SVHN (up to 81.5%). We extend this method to semi-supervised learning by adjusting the targets according to the data type, achieving 96.6% accuracy with only 600 labeled MNIST samples in a multi-layer perceptron. Our results show that this approach can effectively enable networks and hardware initially dedicated to supervised learning to also perform unsupervised learning, adapting to varying availability of labeled data.
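The idea of a "self-defined target" can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: a single sigmoid output layer, a one-hot target placed at the winner of the homeostasis-adjusted outputs, a homeostasis term `h` that grows for neurons that win more often than the uniform rate 1/n_out, and a mean-squared-error loss minimized by gradient descent. The names `self_defined_target`, `update_homeostasis`, `train_step`, `gamma`, and `lr` are hypothetical.

```python
import numpy as np

def self_defined_target(y, h):
    """One-hot target at the winner of the homeostasis-adjusted outputs (assumed form)."""
    winner = np.argmax(y - h, axis=1)               # winner-take-all, one winner per sample
    target = np.zeros_like(y)
    target[np.arange(y.shape[0]), winner] = 1.0
    return target

def update_homeostasis(h, target, gamma=0.5):
    """Penalize neurons that win more often than the uniform rate 1/n_out (assumed rule)."""
    n_out = target.shape[1]
    return h + gamma * (target.mean(axis=0) - 1.0 / n_out)

def train_step(W, h, x, lr=0.1, gamma=0.5):
    """One unsupervised gradient step on 0.5 * ||y - target||^2 for a sigmoid layer."""
    y = 1.0 / (1.0 + np.exp(-(x @ W)))              # output activations
    t = self_defined_target(y, h)                   # target built from the outputs themselves
    delta = (y - t) * y * (1.0 - y)                 # gradient w.r.t. pre-activations
    W = W - lr * (x.T @ delta) / x.shape[0]
    h = update_homeostasis(h, t, gamma)
    return W, h, t
```

In this sketch the target is treated as a constant during the gradient step, so the same update machinery used for a supervised one-hot label is reused unchanged; only the source of the target differs.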

Paper Structure

This paper contains 26 sections, 12 equations, 13 figures, 10 tables, and 1 algorithm.

Figures (13)

  • Figure 1: (A) Self-supervised end-to-end training methods, here illustrated with a contrastive global loss as the training objective. (B) Layer-wise training approaches typically used for unsupervised learning algorithms with local learning rules. (C) Our approach to unsupervised end-to-end training, using an unsupervised global loss defined at the network output.
  • Figure 2: Test accuracies achieved with our unsupervised end-to-end training method, using backpropagation and equilibrium propagation for weight updates, applied to fully-connected and convolutional networks across the MNIST, Fashion-MNIST, and SVHN datasets.
  • Figure 3: Unsupervised learning: (A) network architecture and (B) training-testing process.
  • Figure 4: Test accuracy on MNIST for unsupervised learning, as a function of the percentage of labeled data used for class association: (A) with direct association; (B) with a linear classifier. Green dotted line: one-layer network trained by unsupervised BP, blue dotted line: one-layer network trained by unsupervised EP. Pink lines: untrained networks, including one-layer network (dotted line) and two-layer network (solid line). The single-layer network has 2,000 output neurons, while the two-layer version adds 2,000 hidden neurons.
  • Figure 5: Test accuracy on MNIST for unsupervised training as a function of the amount of labeled data used for class association: (A) with direct association; (B) with a linear classifier. The two-layer networks trained by unsupervised BP (green solid line) and EP (blue solid line) are compared with the one-layer networks trained without supervision (dotted lines). The single-layer network has 2,000 output neurons, while the two-layer version adds 2,000 hidden neurons.
  • ...and 8 more figures
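
The captions of Figures 4 and 5 mention "direct association" of output neurons with classes using a small amount of labeled data, without spelling out the procedure. A minimal sketch of one plausible reading is given below, under the assumption that each output neuron is assigned to the class for which its mean response over the labeled samples is highest, and that a test sample is predicted as the class of its most active output neuron. The function names `direct_association` and `predict` are hypothetical.

```python
import numpy as np

def direct_association(outputs, labels, n_classes):
    """Assign each output neuron to the class with the highest mean response (assumed rule)."""
    one_hot = np.eye(n_classes)[labels]                   # (n_samples, n_classes)
    counts = np.maximum(one_hot.sum(axis=0), 1.0)         # guard against classes with no labels
    mean_resp = (one_hot.T @ outputs) / counts[:, None]   # (n_classes, n_out)
    return mean_resp.argmax(axis=0)                       # class index per output neuron

def predict(outputs, neuron_class):
    """Predict the class associated with the most active output neuron."""
    return neuron_class[outputs.argmax(axis=1)]
```

Because the labels are used only for this association step, the trained weights are left unchanged, which is consistent with the captions treating the labeled fraction as a separate axis from the unsupervised training itself.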