U-TELL: Unsupervised Task Expert Lifelong Learning

Indu Solomon, Aye Phyu Phyu Aung, Uttam Kumar, Senthilnath Jayavelu

TL;DR

U-TELL tackles unsupervised continual learning with unknown task boundaries through a structure-growing modular architecture. It allocates a new Task Expert (TE) for each incoming task and uses a Structured Data Generator (SDG) to synthesize task-consistent samples without storing raw data. A Task Assigner (TA) routes each test sample to the most suitable TE, using cross-entropy in Class-IL scenarios and cosine similarity in Domain-IL scenarios. Learning is driven by losses on the TE, the synthetic data structure, and the TA. Each TE comprises a beta-VAE for distribution learning, a latent-space $k$-means clustering module, and a structure extractor that preserves latent task signatures, with the task-distribution covariance $Q_t$ stored in memory. Across seven benchmarks and one industry wafer-defect dataset, U-TELL outperforms the compared baselines and trains more than 6x faster than the best-performing baseline, while staying memory-efficient because it stores latent signatures rather than replaying samples. The approach targets scalable continual learning with reduced task interference in real-world, label-scarce environments.
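
The cosine-similarity routing mentioned for the Domain-IL case can be illustrated with a minimal sketch. It assumes each expert is summarized by a single signature vector in latent space; the function name `route_by_cosine` and the vector form of the signature are hypothetical choices for illustration, not the paper's actual task assigner.

```python
import numpy as np


def route_by_cosine(z, signatures):
    """Pick the expert whose signature is most cosine-similar to latent vector z.

    `signatures` is a (num_experts, latent_dim) array; one row per expert.
    The vector-valued signature is an assumption made for this sketch.
    """
    z = np.asarray(z, dtype=float)
    sigs = np.asarray(signatures, dtype=float)
    eps = 1e-12  # guards against division by zero for all-zero vectors
    sims = (sigs @ z) / (np.linalg.norm(sigs, axis=1) * np.linalg.norm(z) + eps)
    return int(np.argmax(sims)), sims


# Two hypothetical experts with orthogonal signatures.
signatures = np.array([[1.0, 0.0], [0.0, 1.0]])
expert_a, _ = route_by_cosine([0.9, 0.1], signatures)
expert_b, _ = route_by_cosine([0.2, 2.0], signatures)
```

Cosine similarity ignores vector magnitude, so a sample is routed by the direction of its latent representation rather than its scale.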

Abstract

Continual learning (CL) models are designed to learn new tasks that arrive sequentially without re-training the network. However, real-world ML applications have very limited label information, and these models suffer from catastrophic forgetting. To address these issues, we propose an unsupervised CL model with task experts, called Unsupervised Task Expert Lifelong Learning (U-TELL), which continually learns from sequentially arriving data while addressing catastrophic forgetting. During training of U-TELL, we introduce a new expert on the arrival of each new task. Our proposed architecture has task experts, a structured data generator, and a task assigner. Each task expert is composed of three blocks: (i) a variational autoencoder to capture the task distribution and perform data abstraction, (ii) a k-means clustering module, and (iii) a structure extractor to preserve the latent task data signature. During testing, the task assigner selects a suitable expert to perform clustering. U-TELL does not store or replay task samples; instead, we use generated structured samples to train the task assigner. We compared U-TELL with five SOTA unsupervised CL methods. U-TELL outperformed all baselines on seven benchmarks and one industry dataset across various CL scenarios, with a training time over 6 times faster than that of the best-performing baseline.
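
The expert-per-task growth described in the abstract can be sketched roughly as follows. This is a toy illustration under strong assumptions, not the authors' implementation: a PCA projection stands in for the variational autoencoder, the stored "signature" is simply the latent mean and covariance, and synthetic samples are drawn from a Gaussian with those statistics. All names (`TaskExpert`, `learn_tasks`, `kmeans`) are hypothetical.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns centroids and the last computed labels."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    return centroids, labels


class TaskExpert:
    """Toy expert: linear encoder (VAE stand-in), k-means, stored latent stats."""

    def __init__(self, x, latent_dim, k):
        self.mean = x.mean(axis=0)
        # ASSUMPTION: PCA via SVD replaces the VAE encoder for brevity.
        _, _, vt = np.linalg.svd(x - self.mean, full_matrices=False)
        self.proj = vt[:latent_dim].T
        z = self.encode(x)
        self.centroids, _ = kmeans(z, k)
        # ASSUMPTION: the "signature" is the latent mean and covariance,
        # kept in place of the raw task samples.
        self.z_mean = z.mean(axis=0)
        self.z_cov = np.cov(z, rowvar=False)

    def encode(self, x):
        return (x - self.mean) @ self.proj

    def cluster(self, x):
        """Assign each sample to the nearest latent centroid of this expert."""
        z = self.encode(x)
        dists = np.linalg.norm(z[:, None, :] - self.centroids[None, :, :], axis=2)
        return dists.argmin(axis=1)

    def generate(self, n, rng):
        """Draw synthetic latent samples from the stored statistics."""
        return rng.multivariate_normal(self.z_mean, self.z_cov, size=n)


def learn_tasks(task_stream, latent_dim=2, k=3):
    """Allocate one new expert per arriving task; earlier experts stay untouched."""
    experts = []
    for x in task_stream:
        experts.append(TaskExpert(x, latent_dim, k))
    return experts
```

Because each task gets its own expert and earlier experts are never updated, learning a new task cannot overwrite what was learned for previous ones. In this sketch, the samples returned by `generate` are what would be used to train a task assigner in place of stored data.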

Paper Structure

This paper contains 8 sections, 13 equations, 3 figures, 4 tables, 1 algorithm.

Figures (3)

  • Figure 1: Structure of U-TELL architecture: (a) Block diagram; (b) Detailed diagram of each task expert; (c) Structured data generator in detail; (d) Input and output of the task assigner.
  • Figure 2: Block diagram of the testing phase of the proposed U-TELL.
  • Figure 3: (a) Comparison of individual task performance of U-TELL with the baselines on the PMNIST dataset; (b) Comparison of individual task performance of U-TELL with the baselines on the SSVHN dataset; (c) U-TELL ablation study varying the number of clusters on the SSVHN and RMNIST datasets.