eVAE: Evolutionary Variational Autoencoder

Zhangkai Wu, Longbing Cao, Lei Qi

TL;DR

eVAE addresses the persistent tradeoff between reconstruction quality and latent inference robustness in VAEs by embedding a variational genetic algorithm into a VIB-guided learning framework. An outer VGA continuously evolves the KL weight $\beta$ via SBX-based crossover and Cauchy-based mutation, guided by a fitness that combines ELBO improvement and proximity to a target information bottleneck, yielding an iteration-specific lower bound $-\mathcal{E}(\beta_t)I_t - \mathcal{E}(\beta_t)D_t$. Theoretical analysis connects this approach to rate-distortion theory and VIB, explaining dynamic optimization benefits, while experiments on dSprites, CelebA, and PTB show improved disentanglement, sharper image reconstruction, and mitigation of KL vanishing without constraining architecture. This framework offers a flexible, optimization-driven path to balanced generative learning across domains.

Abstract

The surrogate loss of variational autoencoders (VAEs) poses various challenges to their training, inducing an imbalance between task fitting and representation inference. To avert this, existing strategies for VAEs focus on adjusting the tradeoff by introducing hyperparameters, deriving a tighter bound under some mild assumptions, or decomposing the loss components per certain neural settings. VAEs still suffer from uncertain tradeoff learning. We propose a novel evolutionary variational autoencoder (eVAE) building on the variational information bottleneck (VIB) theory and integrative evolutionary neural learning. eVAE integrates a variational genetic algorithm into VAE with variational evolutionary operators including variational mutation, crossover, and evolution. Its inner-outer-joint training mechanism synergistically and dynamically generates and updates the uncertain tradeoff learning in the evidence lower bound (ELBO) without additional constraints. Apart from learning a lossy compression and representation of data under the VIB assumption, eVAE presents an evolutionary paradigm to tune critical factors of VAEs and deep neural networks and addresses the premature convergence and random search problem by integrating evolutionary optimization into deep learning. Experiments show that eVAE addresses the KL-vanishing problem for text generation with low reconstruction loss, generates all disentangled factors with sharp images, and improves the image generation quality, respectively. eVAE achieves better reconstruction loss, disentanglement, and generation-inference balance than its competitors.
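The outer loop described above (evolving the KL weight with SBX-style crossover and Cauchy-style mutation, then selecting by fitness) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the distribution index `eta`, the mutation scale, the bounds on the KL weight, and especially `toy_fitness` (a stand-in that only measures closeness of a made-up KL curve to a target value) are all hypothetical. In eVAE the fitness would instead come from training the VAE itself.

```python
import math
import random


def sbx_crossover(p1, p2, rng, eta=2.0):
    """Simulated binary crossover (SBX) on two scalar KL weights."""
    u = rng.random()
    if u <= 0.5:
        spread = (2.0 * u) ** (1.0 / (eta + 1.0))
    else:
        spread = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
    # The two children are symmetric around the parents' mean.
    c1 = 0.5 * ((1.0 + spread) * p1 + (1.0 - spread) * p2)
    c2 = 0.5 * ((1.0 - spread) * p1 + (1.0 + spread) * p2)
    return c1, c2


def cauchy_mutation(beta, rng, scale=0.1, low=0.0, high=10.0):
    """Add heavy-tailed Cauchy noise, then clip to [low, high] (assumed bounds)."""
    noise = math.tan(math.pi * (rng.random() - 0.5))  # standard Cauchy sample
    return min(high, max(low, beta + scale * noise))


def toy_fitness(beta, target_kl=19.0):
    """Hypothetical stand-in fitness: closeness of a made-up KL curve to a target."""
    toy_kl = 40.0 / (1.0 + beta)  # assumption: KL shrinks as beta grows
    return -abs(toy_kl - target_kl)


def evolve_step(population, fitness, rng):
    """One generation: pair parents, crossover, mutate, keep the fittest."""
    parents = list(population)
    rng.shuffle(parents)
    children = []
    for p1, p2 in zip(parents[0::2], parents[1::2]):
        for child in sbx_crossover(p1, p2, rng):
            children.append(cauchy_mutation(child, rng))
    pool = list(population) + children
    pool.sort(key=fitness, reverse=True)  # elitist selection
    return pool[: len(population)]


rng = random.Random(0)
population = [rng.uniform(0.0, 10.0) for _ in range(8)]
initial_best = max(toy_fitness(b) for b in population)
for _ in range(30):
    population = evolve_step(population, toy_fitness, rng)
final_best = toy_fitness(population[0])
```

Because selection here is elitist (parents compete with their children), the best fitness in the population cannot decrease from one generation to the next. Whether eVAE uses this exact selection rule is not stated in the summary above, so treat it as a design choice of this sketch.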

Paper Structure

This paper contains 22 sections, 32 equations, 8 figures, 1 table, 1 algorithm.

Figures (8)

  • Figure 1: The framework of eVAE. The VAE results inform the chromosome sampling. The genes are then updated by variational V-crossover and V-mutation. The evolved results are checked per fitness for $t+1$ retraining, giving up, or converging.
  • Figure 2: eVAE - inner-outer joint evolutionary training process. The upper part shows the VAE training at time $t$, the objectives are then incorporated into the lower part - the outer training by variational genetic algorithm, whose fitness-based optimized results are fed to the VAE for further training.
  • Figure 3: The information plane with the $R-D$ curves of VAE, $\beta$-VAE, ControlVAE and eVAE on dSprites.
  • Figure 4: Performance comparison of different VAEs for image generation on CelebA.
  • Figure 5: Learning curves on dSprites. (a, b) indicate that eVAE has the lowest reconstruction loss compared with VAE ($\beta = 1$), $\beta$-VAE ($\beta = 4$), and ControlVAE (KL = 19) at a fixed KL point of 19. (c) shows the element-wise KL divergence as a function of iterations in eVAE. We can observe that eVAE retains a stable and reasonable KL divergence for each generative dimension (factor): position-$y$ (z2), scale (z3), shape (z4), position-$x$ (z6), orientation (z7). More comparisons in terms of generated KL divergence can be found in Supplementary Material E.
  • ...and 3 more figures