
Rectified Noise: A Generative Model Using Positive-incentive Noise

Zhenyu Gu, Yanchen Xu, Sida Huang, Yubin Guo, Hongyuan Zhang

TL;DR

This work enhances Rectified Flow (RF) by injecting Positive-incentive Noise (pi-noise) into the velocity field, yielding Rectified Noise (RN), a pipeline that converts pre-trained RF models into pi-noise generators. RN defines task entropy through the RF loss and learns a pi-noise predictor $\boldsymbol{\epsilon}_\theta$ (trained jointly or by fine-tuning), improving sample quality with modest parameter overhead. Empirical results on ImageNet-1k, AFHQ, and CelebA-HQ show consistent FID reductions (up to $1.11$, $1.89$, and $3.52$, respectively). Among the noise distributions considered, Gaussian noise typically performs best. The approach offers a practical, efficient route to boost RF-based generative models and suggests that pi-noise may apply more broadly to flow-based and interpolant-based generative frameworks.

Abstract

Rectified Flow (RF) has been widely used as an effective generative model. Although RF is primarily based on probability flow Ordinary Differential Equations (ODE), recent studies have shown that injecting noise through reverse-time Stochastic Differential Equations (SDE) for sampling can achieve superior generative performance. Inspired by Positive-incentive Noise (pi-noise), we propose an innovative generative algorithm to train pi-noise generators, namely Rectified Noise (RN), which improves the generative performance by injecting pi-noise into the velocity field of pre-trained RF models. After introducing the Rectified Noise pipeline, pre-trained RF models can be efficiently transformed into pi-noise generators. We validate Rectified Noise by conducting extensive experiments across various model architectures on different datasets. Notably, we find that: (1) RF models using Rectified Noise reduce FID from 10.16 to 9.05 on ImageNet-1k. (2) The models of pi-noise generators achieve improved performance with only 0.39% additional training parameters.
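
The central idea in the abstract, adding a noise term to the velocity predicted by a frozen RF model during sampling, can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: `toy_velocity` and `toy_pi_noise` are hypothetical stand-ins for the pre-trained velocity network and the learned pi-noise generator, the integrator is plain forward Euler, and the convention that time runs from 0 to 1 is assumed rather than taken from the paper.

```python
import random

def toy_velocity(x, t):
    """Hypothetical stand-in for a frozen, pre-trained RF velocity network.
    It simply pushes every coordinate toward the constant target 1.0."""
    return [1.0 - xi for xi in x]

def toy_pi_noise(x, t, rng, scale=0.05):
    """Hypothetical stand-in for a learned pi-noise generator.
    Here it draws fixed-scale Gaussian noise; a trained generator would
    predict the noise from (x, t) instead."""
    return [rng.gauss(0.0, scale) for _ in x]

def euler_sample(x0, num_steps, noise_fn=None, seed=0):
    """Forward-Euler integration of dx/dt = v(x, t), optionally with a
    noise term added to the velocity at every step."""
    rng = random.Random(seed)
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = toy_velocity(x, t)
        if noise_fn is not None:
            eps = noise_fn(x, t, rng)
            # Assumed form of the injection: noise is added to the velocity.
            v = [vi + ei for vi, ei in zip(v, eps)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

if __name__ == "__main__":
    start = [0.0, 2.0, -1.0]
    print("plain RF sampling:", euler_sample(start, num_steps=50))
    print("with noise term:  ", euler_sample(start, num_steps=50,
                                             noise_fn=toy_pi_noise))
```

Passing `noise_fn=None` recovers the deterministic ODE sampler, so the two modes differ only in the extra term added to the velocity. This mirrors the description of a frozen base model plus a small additional component.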

Paper Structure

This paper contains 27 sections, 22 equations, 5 figures, 5 tables, 2 algorithms.

Figures (5)

  • Figure 1: Image results of RF models using $\Delta$RN. Sampling with $\Delta$RN improves natural image generation. The images without red highlight show the generation of the standard RF model. The images outlined in red present the result using $\Delta$RN. Here we show comparisons between images generated by SiT models trained on ImageNet-1k (256 × 256) and SiT models using $\Delta$RN.
  • Figure 2: Overview of the Rectified Noise pipeline. (a) The Rectified Noise model inherits pre-trained knowledge from a foundation model (RF) through parameter freezing. Additional trainable SiT blocks are integrated to predict $\pi$-noise. (b) Inference of traditional RF models. (c) Inference with Rectified Noise involves adding $\pi$-noise to the predicted velocity field.
  • Figure 3: Visualization of the $\pi$-noise by $\Delta$RN. The first row shows the original image generation with the RF model, the second row shows the results of RF models using $\Delta$RN in one step, the third row shows the generated noise for one step, and the fourth row shows the cumulative noise for each time step. We use 180 steps for visualization.
  • Figure 4: Training FID comparison for SiT-B/2 and SiT-B/2 + $\Delta$RN. The SiT-B/2 + $\Delta$RN model converges more slowly than the SiT-B/2 model.
  • Figure 5: Visualization of the $\pi$-noise by $\Delta$RN. The first row shows the original image generation with the RF model, the second row shows the results of RF models using $\Delta$RN in one step, the third row shows the generated noise for one step, and the fourth row shows the cumulative noise for each time step. We use 180 steps for visualization.
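
The per-step and cumulative noise described in the captions of Figures 3 and 5 can be read as a trace recorded during sampling. The sketch below shows one way such a trace could be collected. It is an assumption-laden illustration, not the paper's procedure: `draw_noise` is a hypothetical Gaussian source standing in for the learned generator, and weighting each noise draw by the step size before accumulating is an assumption, since the captions do not say how the cumulative noise is formed.

```python
import random

def draw_noise(dim, rng, scale=0.05):
    """Hypothetical noise source standing in for the learned pi-noise generator."""
    return [rng.gauss(0.0, scale) for _ in range(dim)]

def trace_noise(dim, num_steps, seed=0):
    """Return (per_step, cumulative): the noise drawn at each step and the
    running, step-size-weighted sum of all noise drawn so far."""
    rng = random.Random(seed)
    dt = 1.0 / num_steps
    running = [0.0] * dim
    per_step, cumulative = [], []
    for _ in range(num_steps):
        eps = draw_noise(dim, rng)
        # Assumption: noise enters the state scaled by the step size dt.
        running = [r + dt * e for r, e in zip(running, eps)]
        per_step.append(eps)
        cumulative.append(list(running))
    return per_step, cumulative

if __name__ == "__main__":
    # 180 steps matches the step count quoted in the figure captions.
    per_step, cumulative = trace_noise(dim=4, num_steps=180)
    print("noise at first step:   ", per_step[0])
    print("cumulative after last: ", cumulative[-1])
```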