Diffusion Boosted Trees

Xizewen Han, Mingyuan Zhou

TL;DR

This paper develops Diffusion Boosted Trees (DBT), which can be viewed both as a new denoising diffusion generative model parameterized by decision trees and as a new boosting algorithm that combines weak learners into a strong learner of conditional distributions, without making explicit parametric assumptions on their density forms.

Abstract

Combining the merits of both denoising diffusion probabilistic models and gradient boosting, the diffusion boosting paradigm is introduced for tackling supervised learning problems. We develop Diffusion Boosted Trees (DBT), which can be viewed as both a new denoising diffusion generative model parameterized by decision trees (one single tree for each diffusion timestep), and a new boosting algorithm that combines the weak learners into a strong learner of conditional distributions without making explicit parametric assumptions on their density forms. We demonstrate through experiments the advantages of DBT over deep neural network-based diffusion models as well as the competence of DBT on real-world regression tasks, and present a business application (fraud detection) of DBT for classification on tabular data with the ability of learning to defer.
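
To make the phrase "one single tree for each diffusion timestep" concrete, here is a minimal sketch in pure Python. It is not the authors' algorithm. It assumes a standard DDPM forward process with a standard normal prior on a 1-D response, trains each tree to predict the clean response from the covariate and the noisy response, and uses a tiny hand-written regression tree in place of a real gradient-boosting library. The schedule, tree depth, toy data, and all function names are illustrative assumptions, and the paper's forward process, loss, and boosting construction may differ.

```python
import math
import random

random.seed(0)

T = 10  # number of diffusion timesteps (illustrative choice)
betas = [0.05 + (0.7 - 0.05) * i / (T - 1) for i in range(T)]
alpha_bar = [1.0]  # alpha_bar[0] = 1 corresponds to the clean response
for b in betas:
    alpha_bar.append(alpha_bar[-1] * (1.0 - b))


def sse(vals):
    """Sum of squared deviations from the mean."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)


def fit_tree(rows, targets, depth):
    """Fit a tiny regression tree; a leaf is a float, a split is a tuple."""
    mean = sum(targets) / len(targets)
    if depth == 0 or len(rows) < 8:
        return mean
    best = None
    for j in range(len(rows[0])):
        col = sorted(r[j] for r in rows)
        for q in range(1, 8):  # candidate thresholds at the octiles
            thr = col[q * len(col) // 8]
            left = [y for r, y in zip(rows, targets) if r[j] <= thr]
            right = [y for r, y in zip(rows, targets) if r[j] > thr]
            if left and right:
                score = sse(left) + sse(right)
                if best is None or score < best[0]:
                    best = (score, j, thr)
    if best is None:
        return mean
    _, j, thr = best
    lo = [(r, y) for r, y in zip(rows, targets) if r[j] <= thr]
    hi = [(r, y) for r, y in zip(rows, targets) if r[j] > thr]
    return (j, thr,
            fit_tree([r for r, _ in lo], [y for _, y in lo], depth - 1),
            fit_tree([r for r, _ in hi], [y for _, y in hi], depth - 1))


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        j, thr, left, right = tree
        tree = left if row[j] <= thr else right
    return tree


# Toy regression data (assumption): y = 2x + small Gaussian noise.
xs = [random.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x + 0.1 * random.gauss(0.0, 1.0) for x in xs]

# One tree per timestep: predict the clean y from (x, noisy y_t).
trees = []
for t in range(1, T + 1):
    a = alpha_bar[t]
    rows = [(x, math.sqrt(a) * y + math.sqrt(1.0 - a) * random.gauss(0.0, 1.0))
            for x, y in zip(xs, ys)]
    trees.append(fit_tree(rows, ys, depth=3))


def sample(x):
    """Draw one response for covariate x via standard DDPM ancestral sampling."""
    y_t = random.gauss(0.0, 1.0)
    for t in range(T, 0, -1):
        y0_hat = predict(trees[t - 1], (x, y_t))
        a_t, a_prev, beta = alpha_bar[t], alpha_bar[t - 1], betas[t - 1]
        mean = (math.sqrt(a_prev) * beta * y0_hat
                + math.sqrt(1.0 - beta) * (1.0 - a_prev) * y_t) / (1.0 - a_t)
        var = beta * (1.0 - a_prev) / (1.0 - a_t)
        y_t = mean + math.sqrt(var) * random.gauss(0.0, 1.0)
    return y_t


samples = [sample(0.5) for _ in range(100)]
```

Repeated calls to `sample(x)` give draws from the learned conditional distribution of the response at `x`, so no parametric density form is imposed on the output. In this sketch the final reverse step has zero variance, so each draw ends at a leaf value of the first-timestep tree.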

Paper Structure

This paper contains 40 sections, 54 equations, 6 figures, 9 tables, 5 algorithms.

Figures (6)

  • Figure 1: CARD posterior mean coefficients in Eq. (\ref{eq:card_post_dist_mean}) across all timesteps during sampling.
  • Figure 1: OpenML regression tasks.
  • Figure 2: Comparison of DBT (top row) and CARD (bottom row) on toy regression examples.
  • Figure 3: Beeswarm summary plots of SHAP values at six diffusion timesteps.
  • Figure 3: PIW for both majority-vote predicted class labels.
  • ...and 1 more figure