PROUD: PaRetO-gUided Diffusion Model for Multi-objective Generation

Yinghua Yao, Yuangang Pan, Jing Li, Ivor Tsang, Xin Yao

TL;DR

PROUD presents a constrained optimization framework for multi-objective generation that preserves sample quality while steering diffusion-based generation toward the Pareto front across multiple properties. It introduces an adaptive Pareto-guided gradient, combining the diffusion score with property gradients via a dual optimization to ensure Pareto improvements when needed and high data fidelity otherwise. A diversity regularization term further promotes broad Pareto coverage. Empirical results on CIFAR10 images and pOAS protein sequences show PROUD achieves superior Pareto approximation (HV) and sample quality (FID or log-likelihood) compared with strong baselines, illustrating practical benefits for controllable, multi-objective generation in both continuous and discrete data domains.
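As a rough illustration of the hypervolume (HV) metric mentioned above, the following sketch computes HV for two objectives. It assumes both objectives are minimized and that a reference point `ref` is supplied by the user; the function name `hypervolume_2d` and the toy values are hypothetical, and the paper's own HV setup (sign convention, reference point) may differ.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    # Keep only points strictly better than the reference in both objectives.
    inside = [(f1, f2) for f1, f2 in points if f1 < ref[0] and f2 < ref[1]]
    area = 0.0
    best_f2 = ref[1]
    # Sweep by increasing f1; a point adds area only if it improves on the best f2 so far.
    for f1, f2 in sorted(inside):
        if f2 < best_f2:
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


# Toy objective values (illustrative only, not taken from the paper).
front = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
hv = hypervolume_2d(front, ref=(5.0, 5.0))
print(hv)
```

Under this convention a larger HV means the generated set covers more of the objective space below the reference point, and dominated points add nothing to the total.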

Abstract

Recent advancements in the realm of deep generative models focus on generating samples that satisfy multiple desired properties. However, prevalent approaches optimize these property functions independently, thus omitting the trade-offs among them. In addition, the property optimization is often improperly integrated into the generative models, resulting in an unnecessary compromise on generation quality (i.e., the quality of generated samples). To address these issues, we formulate a constrained optimization problem. It seeks to optimize generation quality while ensuring that generated samples reside at the Pareto front of multiple property objectives. Such a formulation enables the generation of samples that cannot be further improved simultaneously on the conflicting property functions and preserves good quality of generated samples. Building upon this formulation, we introduce the PaRetO-gUided Diffusion model (PROUD), wherein the gradients in the denoising process are dynamically adjusted to enhance generation quality while the generated samples adhere to Pareto optimality. Experimental evaluations on image generation and protein generation tasks demonstrate that our PROUD consistently maintains superior generation quality while approaching Pareto optimality across multiple property functions compared to various baselines.
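To make the notion of samples that "cannot be further improved simultaneously" concrete, here is a minimal sketch that filters a set of objective vectors down to its non-dominated subset. It assumes every objective is minimized; the helper name `non_dominated_mask` and the toy scores are hypothetical and are not part of PROUD itself.

```python
import numpy as np


def non_dominated_mask(F):
    """Boolean mask of rows of F that no other row dominates (minimization)."""
    F = np.asarray(F, dtype=float)
    mask = np.ones(F.shape[0], dtype=bool)
    for i in range(F.shape[0]):
        # Row j dominates row i if it is no worse in every objective
        # and strictly better in at least one.
        dominates_i = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        mask[i] = not dominates_i.any()
    return mask


# Toy objective values [f1, f2] for four samples (illustrative only).
scores = np.array([[1.0, 4.0], [2.0, 2.0], [4.0, 1.0], [3.0, 3.0]])
front = non_dominated_mask(scores)
print(scores[front])
```

In this toy set the last sample is dominated by `[2.0, 2.0]`, so it is dropped; the remaining three trade one objective off against the other. The pairwise check is quadratic in the number of samples, which is acceptable for a few thousand points.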

Paper Structure (23 sections, 16 equations, 14 figures, 5 tables, 1 algorithm)

Figures (14)

  • Figure 1: (a) Diagram of multi-objective generation (best viewed in color). Our multi-objective generation aims to produce samples that simultaneously lie on the Pareto front in the functionality space (Left Panel) and remain within the manifold $\mathcal{X}$ in the sample space (Right Panel), i.e., the green cross. (b) Visualization of the image generation task optimized with two objectives on CIFAR10. Images are directly taken from the original CIFAR10 dataset (see full resolution images in Fig. \ref{fg:cifar_2obj_pf_img}), whose objective values lie on the Pareto front, namely, $\{x \mid x\in X, F(x) =[f_1^{\ast}, f_2^{\ast}] \in F^\ast\}$, where $F^{\ast}$ denotes the points on the Pareto front.
  • Figure 2: Generated images from our PROUD and various baselines on CIFAR10 under two/three conflicting patch-based objectives. The scores under each image are its objective values $[f_1(x), f_2(x)]$/$[f_1(x), f_2(x), f_3(x)]$, respectively; objective values that do not reside on the Pareto front are marked in red.
  • Figure 3: Approximation of the Pareto front by various methods on CIFAR10 optimized with two objectives. Each point denotes a generated sample (1,000 in total), and its coordinates are the sample's objective values. Color depth represents sample density: the deeper the color, the higher the density.
  • Figure 4: Approximation of the Pareto front by various methods on CIFAR10 optimized with three objectives. Each point denotes a generated sample (5,000 in total), and its coordinates are the sample's objective values. Color depth represents sample density: the deeper the color, the higher the density. The values in brackets are earth mover's distances between the generated samples and the ground-truth Pareto solutions. We add this measure because closeness to the Pareto front is hard to judge from the 3D visualization alone; it indicates that our generated samples are indeed close to the front.
  • Figure 5: The approximation of Pareto front (i.e., generated protein sequences) of various methods. We cannot visualize the results of $m$-MGD because all its generated protein sequences are invalid, resulting in nonexistent SASA evaluations ($f_1$).
  • ...and 9 more figures

Theorems & Definitions (2)

  • Definition 1: Pareto optimality
  • Definition 2: Pareto stationarity
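
The two definitions are listed here by name only. For orientation, the block below gives the standard textbook forms, assuming $m$ objectives $f_1, \dots, f_m$ that are all minimized over a feasible set $\mathcal{X}$; the paper's exact statements may differ, for example in sign convention or in how the data manifold enters.

```latex
% Pareto optimality (standard form, minimization assumed):
% x^* is Pareto optimal if no feasible x is at least as good in every
% objective and strictly better in at least one.
\nexists\, x \in \mathcal{X} :\quad
  f_i(x) \le f_i(x^{\ast}) \;\; \forall i \in \{1,\dots,m\}
  \quad\text{and}\quad
  f_j(x) < f_j(x^{\ast}) \;\; \text{for some } j .

% Pareto stationarity (standard first-order form):
% x is Pareto stationary if some convex combination of the objective
% gradients vanishes at x.
\exists\, \omega \in \mathbb{R}^m,\;\; \omega_i \ge 0,\;\; \sum_{i=1}^{m} \omega_i = 1 :
  \quad \sum_{i=1}^{m} \omega_i \nabla f_i(x) = 0 .
```

Pareto stationarity is a necessary first-order condition for Pareto optimality when the objectives are differentiable, which is why gradient-based methods typically target it.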