Supervised Score-Based Modeling by Gradient Boosting
Changyuan Zhao, Hongyang Du, Guangyuan Liu, Dusit Niyato
TL;DR
The paper introduces the Supervised Score-based Model (SSM), a gradient-boosting framework that uses denoising score matching with Langevin dynamics to enable fast, accurate supervised learning. It formalizes a link between gradient boosting updates and noise-free Langevin steps, and develops end-signal guided switching and refinement-rate control to balance prediction accuracy and inference time, all trained via a conditional score network across noise scales. Empirical results on toy tasks, 10 UCI regression datasets, and CIFAR-10/100 show that SSM often surpasses NGBoost, CARD, and DBT in RMSE and accuracy while delivering faster inference and more stable performance. This work offers a practical, theoretically grounded approach to efficient supervised learning with score-based estimators, providing guidance on parameter choices and sampling strategies for real-world use.
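The link between gradient boosting updates and noise-free Langevin steps can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a Gaussian target whose score is known in closed form (standing in for a learned conditional score network), and the function names `gaussian_score`, `langevin_step`, and `boosting_step` are hypothetical. With the noise term dropped, a Langevin step of size `step` matches a squared-loss boosting step with learning rate `step / 2` that uses the exact residual in place of a fitted weak learner.

```python
import numpy as np


def gaussian_score(y, mu, sigma=1.0):
    """Score of N(mu, sigma^2) with respect to y: (mu - y) / sigma^2."""
    return (mu - y) / sigma**2


def langevin_step(y, mu, step, rng=None, sigma=1.0):
    """One Langevin update. Passing rng=None drops the noise term."""
    y_next = y + 0.5 * step * gaussian_score(y, mu, sigma)
    if rng is not None:
        y_next = y_next + np.sqrt(step) * rng.standard_normal(np.shape(y))
    return y_next


def boosting_step(pred, target, lr):
    """One boosting update for the loss 0.5 * (target - pred)^2.

    The exact negative gradient (the residual) is used here instead of a
    fitted weak learner, which is a simplification for illustration.
    """
    return pred + lr * (target - pred)


mu = np.array([2.0, -1.0])  # toy targets (assumed values)
y = np.zeros(2)             # Langevin iterate
f = np.zeros(2)             # boosting prediction
for _ in range(50):
    y = langevin_step(y, mu, step=0.2)   # noise-free Langevin
    f = boosting_step(f, mu, lr=0.1)     # lr = step / 2, sigma = 1

print(y, f)
```

Passing a `numpy.random.Generator` as `rng` restores the stochastic term, so repeated runs scatter around the target instead of converging to it deterministically.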
Abstract
Score-based generative models can effectively learn the distribution of data by estimating the gradient of the log-density (the score). Because these models denoise in multiple steps, researchers have recently considered combining them with gradient boosting, a multi-step supervised learning algorithm, to solve supervised learning tasks. However, existing generative approaches are often limited by the stochastic nature of the models and by long inference times, both of which hurt prediction performance. Therefore, we propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching. We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy. Through ablation experiments on selected examples, we demonstrate the effectiveness of the proposed techniques. Additionally, we compare our model with other probabilistic models, including Natural Gradient Boosting (NGBoost), Classification and Regression Diffusion Models (CARD), and Diffusion Boosted Trees (DBT), as well as non-probabilistic GBM models. The experimental results show that our model outperforms existing models in both accuracy and inference time.
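As a rough illustration of how score matching estimates the gradient of a distribution, the sketch below applies standard denoising score matching to a one-dimensional Gaussian. It is an assumption-laden toy, not the paper's training procedure: the score model is a linear function fitted by least squares rather than a neural network, there is no conditioning on inputs, and all parameter values are chosen only for the example. For Gaussian data the fitted line should approach the true score of the noise-perturbed distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, s, sigma = 2.0, 1.0, 0.5   # data mean, data std, noise scale (assumed)
n = 200_000

# Clean samples and their noise-perturbed versions.
y = mu + s * rng.standard_normal(n)
y_tilde = y + sigma * rng.standard_normal(n)

# Denoising score matching regression target: the score of the Gaussian
# perturbation kernel, -(y_tilde - y) / sigma^2.
target = -(y_tilde - y) / sigma**2

# Linear score model a * y_tilde + b, fitted by least squares.
X = np.stack([y_tilde, np.ones(n)], axis=1)
(a, b), *_ = np.linalg.lstsq(X, target, rcond=None)

# True score of the perturbed density N(mu, s^2 + sigma^2):
# (mu - y_tilde) / (s^2 + sigma^2) = true_a * y_tilde + true_b.
true_a = -1.0 / (s**2 + sigma**2)
true_b = mu / (s**2 + sigma**2)

print(a, true_a)
print(b, true_b)
```

The individual regression targets are very noisy, yet the fit recovers the score because the minimizer of the squared error is the conditional mean of the target given `y_tilde`.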
