Topic-VQ-VAE: Leveraging Latent Codebooks for Flexible Topic-Guided Document Generation

YoungJoon Yoo, Jongwon Choi

TL;DR

The paper addresses topic modeling using latent codebooks from VQ-VAE to capture topic context from pre-trained embeddings. It introduces TVQ-VAE, which treats codebooks and their embeddings as conceptual words, enabling both BoW-style document generation and general autoregressive generation for images. Empirical results on 20NG and NYT show competitive topic quality, while experiments on CIFAR-10 and CelebA demonstrate effective topic-guided image generation and reference-based conditioning. The approach offers a flexible, general probabilistic framework for topic-guided sampling and points toward multi-modal extensions in future work.

Abstract

This paper introduces a novel approach to topic modeling that uses latent codebooks from the Vector-Quantized Variational Auto-Encoder (VQ-VAE), which discretely encapsulate the rich information of pre-trained embeddings such as those of a pre-trained language model. Building on a novel interpretation of the latent codebooks and embeddings as a conceptual bag-of-words, we propose a new generative topic model called Topic-VQ-VAE (TVQ-VAE), which inversely generates the original documents related to the respective latent codebook. TVQ-VAE can visualize topics with various generative distributions, including the traditional BoW distribution and autoregressive image generation. Our experimental results on document analysis and image generation demonstrate that TVQ-VAE effectively captures the topic context, which reveals the underlying structures of the dataset and supports flexible forms of document generation. The official implementation of the proposed TVQ-VAE is available at https://github.com/clovaai/TVQ-VAE.
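
The idea of treating quantized embeddings as a conceptual bag-of-words can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the standard VQ-VAE nearest-neighbor lookup followed by a count over codebook indices. The function names `quantize` and `conceptual_bow`, the toy codebook, and the hand-written embeddings (standing in for the output of a pre-trained encoder) are all assumptions made for illustration.

```python
import numpy as np


def quantize(embeddings, codebook):
    """Return, for each embedding, the index of its nearest codebook vector (L2)."""
    # Broadcast to shape (num_tokens, num_codes, dim), then reduce over dim.
    diffs = embeddings[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    return np.argmin(dists, axis=1)


def conceptual_bow(code_indices, num_codes):
    """Count how often each codebook entry ("conceptual word") occurs in a document."""
    return np.bincount(code_indices, minlength=num_codes)


# Hypothetical toy data: 3 codebook vectors and 4 token embeddings in 2-D.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
doc_embeddings = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [-0.8, 0.9]])

codes = quantize(doc_embeddings, codebook)          # one code index per token
bow = conceptual_bow(codes, num_codes=len(codebook))  # document-level histogram
print(codes, bow)
```

In this sketch the histogram `bow` plays the role that a word-count vector plays in a classical topic model, with codebook indices in place of vocabulary words. How TVQ-VAE learns the codebook and models topics over these counts is described in the paper and is not reproduced here.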

Paper Structure

This paper contains 31 sections, 13 equations, 6 figures, 5 tables, and 2 algorithms.

Figures (6)

  • Figure 1: Graphical representation of the TVQ-VAE. Diagrams (a) and (b) illustrate the TVQ-VAE's graphical representation in both BoW and General forms, while diagram (c) presents an example of vector quantized embedding, conceptual word, and output. Notably, the encoder network is fixed in our method.
  • Figure 2: Quantitative evaluation of topic quality on two datasets: 20NG and NYT. The methods are listed from left to right: LDA, ProdLDA (PLDA), ETM, BERTopic, and TVQ-VAE.
  • Figure 3: Topic quality (TQ) for codebook sizes $\{100, 200, 300\}$ and expansion values $k \in \{1, 3, 5\}$.
  • Figure 4: Illustrations of visualized topics and reference-based generation from TVQ-VAE (P), with $K = 100$ topics.
  • Figure 5: Illustrations of reference-based generation using TVQ-VAE (T), with $K = 100$ topics.
  • ...and 1 more figure