Quantized Embedding Vectors for Controllable Diffusion Language Models

Cheng Kang, Xinye Chen, Yong Hu, Daniel Novak

TL;DR

Diffusion language models are resource-hungry and hard to deploy. This work proposes QE-CDLM, which remodels the task-specific embedding space through vector quantization and combines diffusion generation with an adaption fine-tuning strategy. The method quantizes embedding vectors to three- or four-state schemes and steers generation with a gradient-based controller, aided by fluency regularization and Minimum Bayes Risk decoding. Reported results include faster convergence, a reduction of over 90% in tunable parameters with LoRA, and improved perplexity and control on five fine-grained tasks across three datasets, with competitive fluency. These gains in portability, speed, and stability make controllable diffusion LMs more practical to deploy for controlled text generation.
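To make the three-state scheme concrete, here is a minimal sketch of mapping an embedding vector to the states -1, 0, and +1. The thresholding rule, the `threshold_ratio` parameter, and the function name are assumptions chosen for illustration; this summary does not specify the quantizer QE-CDLM actually uses.

```python
def ternary_quantize(vector, threshold_ratio=0.7):
    """Map each entry of `vector` to -1, 0, or +1.

    Entries whose magnitude is at or below threshold_ratio * mean(|v|)
    become 0. Both the rule and the default ratio are illustrative
    assumptions, not the paper's quantizer.
    """
    delta = threshold_ratio * sum(abs(v) for v in vector) / len(vector)
    return [0 if abs(v) <= delta else (1 if v > 0 else -1) for v in vector]


# Toy 6-dimensional embedding vector (hypothetical values).
embedding = [0.9, -0.05, -1.2, 0.3, 0.02, -0.6]
print(ternary_quantize(embedding))  # prints [1, 0, -1, 0, 0, -1]
```

Each entry then needs only one of three symbols rather than a full floating-point value, which is the kind of compression the portability argument relies on.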

Abstract

Improving the controllability, portability, and inference speed of diffusion language models (DLMs) is a key challenge in natural language generation. While recent research has shown significant success in complex text generation with language models, their memory and compute requirements remain high, which results in low portability and instability. Numerous well-established neural network quantization methods have been proposed to mitigate these issues. To further improve the portability of independently deployed models and their stability as measured by perplexity, we propose the Quantized Embedding Controllable Diffusion Language Model (QE-CDLM). QE-CDLM builds upon recent successful controllable DLMs by remodeling the task-specific embedding space via quantization. This yields a gradient-based controller for the generation tasks and more stable intermediate latent variables, which in turn accelerates convergence and improves controllability. Additionally, an adaption fine-tuning method is employed to reduce the number of tunable weights. Experimental results on five challenging fine-grained control tasks show that QE-CDLM compares favorably to existing methods in quality and feasibility, achieving better perplexity with lightweight fine-tuning.
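The gradient-based controller can be pictured as a gradient-ascent update on a latent vector that balances a fluency term against a control term. The sketch below is a toy version under stated assumptions: the fluency term is a Gaussian log-density around a denoiser mean `mu`, the control term is a logistic classifier with hypothetical weights `w` and bias `b`, and `lam` and `step` are made-up hyperparameters. It is not the controller or classifier used in the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def control_prob(x, w, b):
    """Toy classifier: p(c = 1 | x) = sigmoid(w . x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def control_step(x, mu, sigma, w, b, lam=0.1, step=0.1):
    """One ascent step on lam * log N(x; mu, sigma^2 I) + log p(c = 1 | x)."""
    p = control_prob(x, w, b)
    updated = []
    for xi, mi, wi in zip(x, mu, w):
        fluency_grad = -(xi - mi) / sigma ** 2  # pulls x back toward mu
        control_grad = (1.0 - p) * wi           # gradient of log sigmoid(w . x + b)
        updated.append(xi + step * (lam * fluency_grad + control_grad))
    return updated


# Hypothetical 3-dimensional latent, denoiser mean, and classifier weights.
x = [0.0, 0.0, 0.0]
mu = [0.0, 0.0, 0.0]
w, b = [1.0, -2.0, 0.5], 0.0

before = control_prob(x, w, b)
for _ in range(20):
    x = control_step(x, mu, sigma=1.0, w=w, b=b)
after = control_prob(x, w, b)
print(before, after)
```

In this toy setup the classifier probability rises from its starting value of 0.5, while the `lam` term keeps the latent from drifting arbitrarily far from `mu`, which is the role the fluency regularization plays.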

Paper Structure (39 sections, 20 equations, 6 figures, 14 tables)

This paper contains 39 sections, 20 equations, 6 figures, and 14 tables.

Figures (6)

  • Figure 1: The proposed method contains two main steps: QE-DLM and Classifier. In the first step, QE-DLM denoises a sequence of quantized Gaussian vectors that are added to word vectors. The quantized embedding vectors then compress and remodel the discrete latent space through a reverse diffusion process. In the second step, the Classifier applies control-driven gradient updates on the continuous latent space. The DLM generates fluent text, while a suitable classifier constrains the generated text according to a specific control target, such as a Parse Tree.
  • Figure 2: A graphical model representing the forward and reverse diffusion processes. Following Li et al. (2022), a Markov transition is introduced between $x_{0}$ and $w$ to achieve end-to-end training and optimize the discrete space based on a quantization method. The discrete space in $x_t$ is remodeled with a quantized vector $[-1, -1, \ldots, 1]$.
  • Figure 3: The schematic diagram of four quantization methods in terms of the backpropagation process. For the gradient, $p$ is the remainder in the integer part, and $q$ is the remainder in the fractional part.
  • Figure 4: The difference in rounding with and without quantization.
  • Figure 5: The structure of the Syntax Tree presented in Table \ref{Table3-2}.
  • ...and 1 more figures
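
The rounding step mentioned in the Figure 4 caption maps a denoised latent vector back to a discrete token. A minimal sketch follows, assuming rounding is a nearest-neighbor lookup under squared Euclidean distance and using a fixed-threshold three-state quantizer as a stand-in. The embedding values, the threshold, and all function names are hypothetical, not taken from the paper.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_token(latent, table):
    """Index of the embedding row closest to `latent`."""
    return min(range(len(table)), key=lambda i: sq_dist(latent, table[i]))


def rounding_margin(latent, table):
    """Gap between the runner-up and the best squared distance."""
    dists = sorted(sq_dist(latent, row) for row in table)
    return dists[1] - dists[0]


def three_state(row, threshold=0.2):
    """Stand-in quantizer: map entries to -1, 0, or +1 by a fixed threshold."""
    return [0 if abs(v) <= threshold else (1 if v > 0 else -1) for v in row]


# Hypothetical 3-token vocabulary with 2-dimensional embeddings.
continuous = [[0.4, 0.05], [0.3, -0.5], [-0.4, 0.5]]
quantized = [three_state(row) for row in continuous]  # [[1, 0], [1, -1], [-1, 1]]

latent = [0.6, -0.1]  # a denoised latent to be rounded to a token
print(nearest_token(latent, continuous), rounding_margin(latent, continuous))
print(nearest_token(latent, quantized), rounding_margin(latent, quantized))
```

In this toy case both tables round the latent to token 0, but the margin over the runner-up is larger with the quantized table. That is only an illustration of why rounding might be less ambiguous after quantization, not a reproduction of the figure.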