Dale meets Langevin: A Multiplicative Denoising Diffusion Model

Nishanth Shetty, Madhava Prasath, Chandra Sekhar Seelamantula

TL;DR

This work proposes a biologically grounded diffusion framework based on geometric Brownian motion (GBM) that yields multiplicative, sign-preserving updates consistent with Dale's law and log-normally distributed synaptic weights. By deriving a reverse-time SDE through a log-transform, the authors obtain a multiplicative sampling rule that matches an exponentiated gradient-descent-like update, linking learning dynamics to generative sampling. They introduce multiplicative score-matching losses (M-ESM and M-DSM) to train a score network under multiplicative noise, and demonstrate image generation on MNIST, Fashion-MNIST, and Kuzushiji-MNIST, with qualitative samples and FID/KID metrics. The approach extends diffusion-based modeling to non-negative data and multiplicative noise, suggesting potential applications in high-resolution imaging and in non-image domains such as financial time series.
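To make "multiplicative, sign-preserving" concrete, the sketch below compares an ordinary additive gradient step with an exponentiated one on a toy quadratic loss. It is illustrative only: the update form `w * exp(-lr * grad * sign(w))`, the loss, and the names `gd_step`, `egd_step`, and `run` are assumptions made for this example and are not taken from the paper's code.

```python
import numpy as np

def gd_step(w, grad, lr):
    # Additive update: a weight can cross zero and change sign.
    return w - lr * grad

def egd_step(w, grad, lr):
    # Assumed exponentiated (multiplicative) update: each weight is scaled
    # by a strictly positive factor, so its sign can never change.
    return w * np.exp(-lr * grad * np.sign(w))

def run(step_fn, w0, target, lr=0.1, n_steps=100):
    # Toy loss L(w) = 0.5 * ||w - target||^2, so grad = w - target.
    w = np.array(w0, dtype=float)
    for _ in range(n_steps):
        w = step_fn(w, w - target, lr)
    return w

w0 = np.array([0.5, -0.5])        # one "excitatory", one "inhibitory" weight
target = np.array([-1.0, 1.0])    # minimizer lies on the opposite side of zero

w_gd = run(gd_step, w0, target)
w_egd = run(egd_step, w0, target)
print("GD  signs:", np.sign(w_gd))
print("EGD signs:", np.sign(w_egd))
```

In this toy setup the additive rule drives both weights across zero toward the target, while the multiplicative rule can only shrink them toward zero, which is the sign constraint that Dale's law describes.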

Abstract

Gradient descent has proven to be a powerful and effective technique for optimization in numerous machine learning applications. Recent advances in computational neuroscience have shown that learning under the standard gradient descent formulation is not consistent with learning in biological systems. This has opened up interesting avenues for building biologically inspired learning techniques. One such approach is inspired by Dale's law, which states that inhibitory and excitatory synapses do not swap roles during the course of learning. The resulting exponential gradient descent optimization scheme leads to log-normally distributed synaptic weights. Interestingly, the density that satisfies the Fokker-Planck equation corresponding to the stochastic differential equation (SDE) with geometric Brownian motion (GBM) is the log-normal density. Leveraging this connection, we start with the SDE governing geometric Brownian motion and show that discretizing the corresponding reverse-time SDE yields a multiplicative update rule, which, surprisingly, coincides with the sampling equivalent of the exponential gradient descent update founded on Dale's law. Furthermore, we propose a new formalism for multiplicative denoising score-matching, subsuming the loss function proposed by Hyvärinen for non-negative data. Indeed, log-normally distributed data is positive, and the proposed score-matching formalism turns out to be a natural fit. This allows for training of score-based models for image data and results in a novel multiplicative update scheme for sample generation starting from a log-normal density. Experimental results on the MNIST, Fashion MNIST, and Kuzushiji datasets demonstrate the generative capability of the new scheme. To the best of our knowledge, this is the first instance of a biologically inspired generative model employing multiplicative updates, founded on geometric Brownian motion.
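The link between GBM and the log-normal density can be checked numerically. The sketch below simulates the standard GBM SDE $dX_t = \mu X_t\,dt + \sigma X_t\,dW_t$ using its exact multiplicative one-step transition, then compares the empirical mean and variance of $\log X_T$ with the textbook values $(\mu - \sigma^2/2)T$ and $\sigma^2 T$. The function name `simulate_gbm` and all parameter values are assumptions chosen for illustration; the paper's own drift and diffusion schedule may differ.

```python
import numpy as np

def simulate_gbm(x0, mu, sigma, T, steps, n_paths, seed=0):
    """Simulate GBM paths and return the values at time T."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    x = np.full(n_paths, float(x0))
    for _ in range(steps):
        z = rng.standard_normal(n_paths)
        # Exact GBM transition: multiply by a positive log-normal factor,
        # so samples that start positive stay positive.
        x = x * np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)
    return x

mu, sigma, T = 0.1, 0.5, 1.0
x_T = simulate_gbm(x0=1.0, mu=mu, sigma=sigma, T=T, steps=100, n_paths=50_000)

log_x = np.log(x_T)
print("empirical mean of log X_T:", log_x.mean(), "theory:", (mu - 0.5 * sigma**2) * T)
print("empirical var  of log X_T:", log_x.var(), "theory:", sigma**2 * T)
```

Because every step multiplies by a positive factor, the noise is multiplicative and the process never leaves the positive orthant, which is why a log-normal density is the natural starting point for sampling in this setting.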

Paper Structure

This paper contains 28 sections, 2 theorems, 38 equations, 11 figures, 1 table, 2 algorithms.

Key Result

Theorem 5.1

Under standard assumptions on the density and the score function (Hyvärinen, 2005; Song et al., 2019) over the positive orthant $\mathbb{R}_{+}^{d}$, the multiplicative explicit score-matching (M-ESM) loss given in Eq. (eq:mesm) and the multiplicative denoising score-matching (M-DSM) loss given in Eq. (eq:mdsm) are ...

Figures (11)

  • Figure 1: The forward and reverse-time SDEs for Geometric Brownian Motion (GBM). The forward SDE describes the evolution of a clean image sample to a noisy one that eventually becomes log-normally distributed, while the reverse-time SDE captures the dynamics of the process and generates new samples from the unknown density starting from log-normal noise. This is enabled by the knowledge of the unknown density manifesting through the score function.
  • Figure 2: Uncurated sample images generated from MNIST, Fashion-MNIST and Kuzushiji MNIST datasets, corresponding to the score model with minimum score-matching loss during training.
  • Figure 3: The samples have high diversity and the model even generates samples that are not present in the training data but have semantic similarity to the training data.
  • Figure 4: Generated Kuzushiji samples. The generated samples are sufficiently diverse and sharp and distinct from the training data.
  • Figure 5: Generated Fashion MNIST samples. We observe less diversity of the generated samples here compared to MNIST and Kuzushiji MNIST possibly due to the complexity of the training data.
  • ...and 6 more figures
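
The forward and reverse-time picture described in the Figure 1 caption can be reproduced on a one-dimensional toy problem where the score is known in closed form. The sketch below is a hedged illustration, not the paper's algorithm: it assumes log-normal "data" with $\log X_0 \sim \mathcal{N}(m_0, s_0^2)$, writes the forward GBM in the log domain as $dY = a\,dt + \sigma\,dW$ with $a = \mu - \sigma^2/2$, and discretizes the standard reverse-time SDE $dY = [a - \sigma^2 \nabla_y \log p_t(Y)]\,dt + \sigma\,d\bar{W}$ with Euler-Maruyama steps. Exponentiating each step gives a multiplicative update on $X$. The analytic Gaussian score stands in for a trained score network, and the names `reverse_gbm_sampler` and `log_score` and all parameter values are hypothetical.

```python
import numpy as np

def reverse_gbm_sampler(n=20_000, mu=0.0, sigma=1.0, T=1.0, steps=1000,
                        m0=0.5, s0=0.5, seed=0):
    """Sample a toy log-normal target by running reverse-time GBM from t=T to t=0."""
    rng = np.random.default_rng(seed)
    a = mu - 0.5 * sigma**2          # drift of log X under the forward GBM
    dt = T / steps

    def log_score(y, t):
        # Closed-form score of the log-domain marginal N(m0 + a*t, s0^2 + sigma^2*t).
        # In practice this would be replaced by a learned score model.
        m_t = m0 + a * t
        v_t = s0**2 + sigma**2 * t
        return -(y - m_t) / v_t

    # Start from the log-normal marginal at time T.
    x = np.exp(rng.normal(m0 + a * T, np.sqrt(s0**2 + sigma**2 * T), size=n))

    for k in range(steps, 0, -1):
        t = k * dt
        z = rng.standard_normal(n)
        drift = a - sigma**2 * log_score(np.log(x), t)
        # Multiplicative update: an Euler-Maruyama step in log space, exponentiated.
        x = x * np.exp(-drift * dt + sigma * np.sqrt(dt) * z)
    return x

samples = reverse_gbm_sampler()
print("mean of log samples:", np.log(samples).mean(), "target:", 0.5)
print("var  of log samples:", np.log(samples).var(), "target:", 0.25)
```

With these toy settings the generated samples stay strictly positive at every step, and the mean and variance of their logarithms should land close to the target values, up to discretization and Monte Carlo error.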

Theorems & Definitions (3)

  • Theorem 5.1: Multiplicative Denoising Score-Matching
  • Theorem C.1: Multiplicative Denoising Score-Matching
  • proof