Deep Generative Clustering with VAEs and Expectation-Maximization

Michael Adipoetra, Ségolène Martin

TL;DR

The paper addresses unsupervised image clustering, where traditional Gaussian-prior approaches struggle with multimodal cluster structure. It proposes an EM-inspired framework in which each cluster is modeled by its own VAE, with uniform mixing weights and soft assignments; the objective combines the ELBOs of the cluster-specific VAEs with an entropy term on the assignments. The E-step updates the soft cluster memberships via a softmax over the cluster ELBOs. The M-step updates the parameters of each cluster-specific VAE by optimizing a weighted ELBO with Adam, in the spirit of generalized EM (GEM), which aims for a non-decreasing ELBO rather than an exact maximization at every step. Experiments on MNIST and FashionMNIST show higher average clustering accuracy than state-of-the-art VAE-based clustering methods, along with clear cluster-specific sample generation.
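The E-step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `e_step` and `objective`, the array layout, and the ELBO values are assumptions made for illustration. It takes a matrix of per-cluster ELBO estimates as given, whereas in the method these would come from the cluster-specific VAEs.

```python
import numpy as np

def e_step(elbos):
    """Soft assignments as a row-wise softmax over per-cluster ELBOs.

    elbos: array of shape (n_samples, n_clusters); entry [i, k] is an
    ELBO estimate for sample i under cluster k (hypothetical input).
    """
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = elbos - elbos.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=1, keepdims=True)

def objective(elbos, resp, eps=1e-12):
    """Assignment-weighted ELBOs plus the entropy of the assignments."""
    entropy = -(resp * np.log(resp + eps)).sum(axis=1)
    return ((resp * elbos).sum(axis=1) + entropy).sum()

# Made-up ELBO values for two samples and three clusters.
elbos = np.array([[-95.0, -120.0, -97.0],
                  [-210.0, -180.0, -181.0]])
resp = e_step(elbos)
hard_labels = resp.argmax(axis=1)  # most likely cluster per sample
```

Under uniform mixing weights, the softmax assignment is the maximizer of `objective` over all valid soft assignments for fixed ELBOs, so evaluating `objective(elbos, resp)` should never give a lower value than any other assignment, such as a uniform one.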

Abstract

We propose a novel deep clustering method that integrates Variational Autoencoders (VAEs) into the Expectation-Maximization (EM) framework. Our approach models the probability distribution of each cluster with a VAE and alternates between updating model parameters by maximizing the Evidence Lower Bound (ELBO) of the log-likelihood and refining cluster assignments based on the learned distributions. This enables effective clustering and generation of new samples from each cluster. Unlike existing VAE-based methods, our approach eliminates the need for a Gaussian Mixture Model (GMM) prior or additional regularization techniques. Experiments on MNIST and FashionMNIST demonstrate superior clustering performance compared to state-of-the-art methods.
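The alternation described in the abstract can be written compactly as follows. This is a sketch consistent with the summary on this page, not a transcription of the paper's equations: the symbols $r_{ik}$ (soft assignment of sample $x_i$ to cluster $k$), $\mathcal{L}_k$ (ELBO of the $k$-th VAE), and $\theta_k$ (its parameters) are assumed notation, and the constant contributed by the uniform mixing weights is dropped.

```latex
% Objective: assignment-weighted ELBOs plus entropy of the assignments
\max_{\theta,\, r}\;
  \sum_{i=1}^{N} \sum_{k=1}^{K} r_{ik}\, \mathcal{L}_k(x_i; \theta_k)
  \;-\; \sum_{i=1}^{N} \sum_{k=1}^{K} r_{ik} \log r_{ik},
\qquad r_{ik} \ge 0,\quad \sum_{k=1}^{K} r_{ik} = 1 .

% E-step: for fixed \theta, the maximizer over r is a softmax of the ELBOs
r_{ik} \;=\;
  \frac{\exp \mathcal{L}_k(x_i; \theta_k)}
       {\sum_{j=1}^{K} \exp \mathcal{L}_j(x_i; \theta_j)} .

% M-step: for fixed r, improve each cluster's weighted ELBO
% (a few gradient steps rather than an exact maximization)
\theta_k \;\leftarrow\; \text{a step that increases}\;
  \sum_{i=1}^{N} r_{ik}\, \mathcal{L}_k(x_i; \theta_k) .
```

The E-step formula follows from setting the gradient of the Lagrangian with respect to $r_{ik}$ to zero, which gives $\log r_{ik} = \mathcal{L}_k(x_i; \theta_k) + \text{const}$, and then normalizing over $k$.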

Paper Structure

This paper contains 21 sections, 25 equations, 2 figures, 2 tables, and 1 algorithm.

Figures (2)

  • Figure 1: Training samples (unlabeled) and generated samples after 100, 200, and 1000 EM iterations. Each color represents samples from the same cluster-specific VAE. Our method effectively clusters and generates samples.
  • Figure 2: Generated samples with our method on MNIST (left) and Fashion-MNIST (right). Images in the same row come from the same cluster.

Theorems & Definitions (1)

  • Remark 3.1