Supervised Score-Based Modeling by Gradient Boosting

Changyuan Zhao, Hongyang Du, Guangyuan Liu, Dusit Niyato

TL;DR

The paper introduces the Supervised Score-based Model (SSM), a gradient-boosting framework that uses denoising score matching with Langevin dynamics for fast, accurate supervised learning. It formalizes a link between gradient boosting updates and noise-free Langevin steps, and develops end-signal guided switching and refinement-rate control to balance prediction accuracy against inference time. The model is trained as a single conditional score network across noise scales. Empirical results on toy tasks, 10 UCI regression datasets, and CIFAR-10/100 show that SSM often surpasses NGBoost, CARD, and DBT in RMSE and accuracy while delivering faster inference and more stable performance. The paper also gives guidance on parameter choices and sampling strategies.

Abstract

Score-based generative models can effectively learn the distribution of data by estimating the gradient of the distribution. Because of their multi-step denoising characteristic, researchers have recently considered combining score-based generative models with the gradient boosting algorithm, a multi-step supervised learning algorithm, to solve supervised learning tasks. However, existing generative model algorithms are often limited by the stochastic nature of the models and by long inference times, which hurt prediction performance. Therefore, we propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching. We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy. Through ablation experiments on selected examples, we demonstrate the strong performance of the proposed techniques. Additionally, we compare our model with other probabilistic models, including Natural Gradient Boosting (NGBoost), Classification and Regression Diffusion Models (CARD), and Diffusion Boosted Trees (DBT), as well as non-probabilistic GBM models. The experimental results show that our model outperforms existing models in both accuracy and inference time.

Paper Structure

This paper contains 28 sections, 8 theorems, 49 equations, 2 figures, 15 tables, 2 algorithms.

Key Result

Proposition 1

For an input-target pair $(x_I,y_I) \in \mathcal{D}$, let $y_0$ denote the initial prediction point. After $t$ steps of denoising, the current estimated prediction $y_t$ satisfies, where $r_L = \epsilon / \sigma_L^2$ is the refinement rate.
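To build intuition for the refinement rate $r_L = \epsilon / \sigma_L^2$, the following is a minimal sketch under an idealized assumption that is not taken from the paper: the score at the final noise level is exactly the Gaussian one, $s(y) = -(y - y_I)/\sigma_L^2$. Under that assumption, a noise-free Langevin step $y_{t+1} = y_t + \epsilon\, s(y_t)$ moves the prediction a fraction $r_L$ of the way toward the target, so the error shrinks geometrically by a factor $(1 - r_L)$ per step. The function name `noise_free_langevin` and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def noise_free_langevin(y0, y_target, eps, sigma, steps):
    """Iterate y <- y + eps * score(y) with no injected noise.

    Assumption (not from the paper): the score is the idealized
    Gaussian one, score(y) = -(y - y_target) / sigma**2.
    """
    r = eps / sigma**2  # refinement rate r_L = eps / sigma_L^2
    y = float(y0)
    path = [y]
    for _ in range(steps):
        score = -(y - y_target) / sigma**2
        y = y + eps * score  # deterministic (noise-free) Langevin step
        path.append(y)
    return np.array(path), r

# Hypothetical values: start at 0, target 2, eps = 0.05, sigma_L = 0.5.
path, r = noise_free_langevin(y0=0.0, y_target=2.0, eps=0.05, sigma=0.5, steps=20)

# Under the idealized score, the iterates follow a geometric contraction:
# y_t = y_target + (1 - r)**t * (y0 - y_target)
closed_form = 2.0 + (1.0 - r) ** np.arange(21) * (0.0 - 2.0)
```

In this toy setting a larger $r_L$ (below 1) reaches the target in fewer steps, which is one way to read the trade-off between inference time and accuracy that the refinement rate controls.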

Figures (2)

  • Figure 1: The scatter plots for toy examples. From left to right: linear regression, quadratic regression, log-log linear regression, log-log cubic regression, and sinusoidal regression. The green and blue points represent the true values and the prediction results generated by 1000 samples, respectively.
  • Figure 2: Prediction results for one batch at different numbers of denoising steps on the UCI-Yacht task, by SSM (top) and CARD (bottom). $n$ is the number of denoising steps. The red and blue points represent the true values and the prediction results generated by 1000 samples, respectively.

Theorems & Definitions (12)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • ...and 2 more