Cooperative Training of Descriptor and Generator Networks

Jianwen Xie; Yang Lu; Ruiqi Gao; Song-Chun Zhu; Ying Nian Wu

TL;DR

CoopNets is a cooperative framework that jointly trains a bottom-up energy-based descriptor network and a top-down latent-variable generator network through MCMC teaching. The generator supplies initial synthesized samples, the descriptor refines them with finite-step Langevin dynamics, and the generator then learns to reproduce those refinements, which ties energy-based and latent-variable learning together in a single algorithm. Across textures, objects, scenes, digits, and dynamic textures, CoopNets yields highly realistic synthesis and robust pattern completion, often outperforming GAN- and VAE-based baselines. The approach offers a new perspective on combining undirected and directed models and suggests natural extensions to conditional generation.

Abstract

This paper studies the cooperative training of two generative models for image modeling and synthesis. Both models are parametrized by convolutional neural networks (ConvNets). The first model is a deep energy-based model, whose energy function is defined by a bottom-up ConvNet, which maps the observed image to the energy. We call it the descriptor network. The second model is a generator network, which is a non-linear version of factor analysis. It is defined by a top-down ConvNet, which maps the latent factors to the observed image. The maximum likelihood learning algorithms of both models involve MCMC sampling such as Langevin dynamics. We observe that the two learning algorithms can be seamlessly interwoven into a cooperative learning algorithm that can train both models simultaneously. Specifically, within each iteration of the cooperative learning algorithm, the generator model generates initial synthesized examples to initialize a finite-step MCMC that samples and trains the energy-based descriptor model. After that, the generator model learns from how the MCMC changes its synthesized examples. That is, the descriptor model teaches the generator model by MCMC, so that the generator model accumulates the MCMC transitions and reproduces them by direct ancestral sampling. We call this scheme MCMC teaching. We show that the cooperative algorithm can learn highly realistic generative models.
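
The cooperative loop described in the abstract can be illustrated with a deliberately tiny sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are one-dimensional, the descriptor uses a hypothetical quadratic energy U(x) = 0.5 (x - mu)^2 in place of a bottom-up ConvNet, the generator is a linear map g(z) = w z + b in place of a top-down ConvNet, and the generator reuses the latent z that produced each sample rather than re-inferring it. The names `langevin_revise` and `cooperative_train` are likewise hypothetical.

```python
import numpy as np


def langevin_revise(x, mu, step, n_steps, rng):
    """Finite-step Langevin dynamics on the toy energy U(x) = 0.5 * (x - mu)**2."""
    for _ in range(n_steps):
        grad_u = x - mu  # dU/dx for the assumed quadratic energy
        x = x - 0.5 * step**2 * grad_u + step * rng.standard_normal(x.shape)
    return x


def cooperative_train(x_obs, n_iters=500, batch=256, step=0.3, n_steps=10,
                      lr_d=0.1, lr_g=0.1, seed=0):
    """Toy cooperative training loop; returns (mu, w, b)."""
    rng = np.random.default_rng(seed)
    mu = 0.0         # descriptor parameter (location of the toy energy)
    w, b = 1.0, 0.0  # generator parameters: g(z) = w * z + b

    for _ in range(n_iters):
        # 1. The generator proposes initial synthesized examples.
        z = rng.standard_normal(batch)
        x_hat = w * z + b

        # 2. The descriptor revises them with a finite-step MCMC.
        x_tilde = langevin_revise(x_hat, mu, step, n_steps, rng)

        # 3. Descriptor update: with f = -U, df/dmu = (x - mu), so the
        #    observed-minus-synthesized gradient is a difference of means.
        mu += lr_d * (x_obs.mean() - x_tilde.mean())

        # 4. Generator update ("MCMC teaching"): one gradient step on the
        #    squared error between g(z) and the revised examples, z held fixed.
        resid = x_tilde - x_hat
        w += lr_g * np.mean(resid * z)
        b += lr_g * np.mean(resid)

    return mu, w, b


# Synthetic "observed" data centred at 3.0 (made up for this sketch).
x_obs = 3.0 + np.random.default_rng(1).standard_normal(1000)
mu, w, b = cooperative_train(x_obs)
```

In this toy setting both the descriptor location `mu` and the generator offset `b` should drift toward the mean of the observed data. The sketch is only meant to show the data flow between the two models; it makes no attempt to match the variance of the data or to reproduce the paper's architectures.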
