
NRGBoost: Energy-Based Generative Boosted Trees

João Bravo

TL;DR

NRGBoost is an energy-based generative boosting method for tabular data. It learns a density q_f(x) ∝ exp(f(x)), with normalization constant Z[f], by maximizing a quadratic approximation to the log-likelihood at each boosting step. It uses piecewise-constant trees as weak learners and relies on amortized Gibbs sampling to keep the computational cost manageable, with a rejection-sampling scheme that reuses samples across boosting rounds. Empirically, NRGBoost achieves discriminative performance close to that of gradient-boosted decision trees on real datasets and generates samples competitive with those of neural generative models, while also supporting principled marginalization over missing data and flexible conditional sampling. The work argues for the viability of tree-based density modeling through energy-based boosting and discusses the practical trade-offs in training cost, sampling efficiency, and applicability to density-based inference tasks on tabular data.
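To make the "quadratic approximation to the log-likelihood" concrete, the following is a short derivation sketch. It assumes a discrete domain (replace the sum with an integral for continuous x), a data distribution p, and a candidate update δf; the bound via log z ≤ z − 1 followed by a second-order expansion is one standard route to a quadratic objective, and the exact form used in the paper may differ.

```latex
\begin{align*}
q_f(x) &= \frac{\exp f(x)}{Z[f]}, \qquad Z[f] = \sum_x \exp f(x), \\
L[f] &= \mathbb{E}_{x \sim p}\left[\log q_f(x)\right]
      = \mathbb{E}_{p}\left[f(x)\right] - \log Z[f], \\
L[f + \delta f] - L[f]
     &= \mathbb{E}_{p}\left[\delta f\right]
        - \log \mathbb{E}_{q_f}\left[\exp \delta f\right]
        && \text{(exact, since } Z[f+\delta f]/Z[f] = \mathbb{E}_{q_f}[\exp \delta f]\text{)} \\
     &\ge \mathbb{E}_{p}\left[\delta f\right]
        - \left(\mathbb{E}_{q_f}\left[\exp \delta f\right] - 1\right)
        && \text{(using } \log z \le z - 1\text{)} \\
     &\approx \mathbb{E}_{p}\left[\delta f\right]
        - \mathbb{E}_{q_f}\left[\delta f + \tfrac{1}{2}\,\delta f^{2}\right]
        && \text{(second order in } \delta f\text{)}.
\end{align*}
```

For a piecewise-constant δf that takes the value w_j on leaf j, with data mass P_j and model mass Q_j in that leaf, this quadratic objective separates over leaves and is maximized at w_j = P_j / Q_j − 1, which plays a role similar to the closed-form leaf weights in second-order gradient boosting.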

Abstract

Despite the rise to dominance of deep learning in unstructured data domains, tree-based methods such as Random Forests (RF) and Gradient Boosted Decision Trees (GBDT) are still the workhorses for handling discriminative tasks on tabular data. We explore generative extensions of these popular algorithms with a focus on explicitly modeling the data density (up to a normalization constant), thus enabling other applications besides sampling. As our main contribution we propose an energy-based generative boosting algorithm that is analogous to the second-order boosting implemented in popular libraries like XGBoost. We show that, despite producing a generative model capable of handling inference tasks over any input variable, our proposed algorithm can achieve similar discriminative performance to GBDT on a number of real world tabular datasets, outperforming alternative generative approaches. At the same time, we show that it is also competitive with neural-network-based models for sampling. Code is available at https://github.com/ajoo/nrgboost.

Paper Structure

This paper contains 58 sections, 46 equations, 6 figures, 14 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Overview of an NRGBoost training iteration: 1) Draw new samples from an ensemble of trees representing a current energy function, $f$, and add them to the sample pool. 2) Fit a new weak learner (tree) representing a model update, $\delta f$. 3) Update the model. 4) Use rejection sampling to discard samples from the sample pool that conform poorly to the new model.
  • Figure 2: Density learned by NRGBoost at different boosting iterations (1, 3, 10 and 100), starting from a uniform distribution. The data distribution is depicted on the right (see Appendix \ref{app:datasets} for details). Weak learners are piecewise constant functions given by binary trees with 16 leaves.
  • Figure 3: Joint histogram for the latitude and longitude for the California Housing dataset.
  • Figure 4: Downsampled MNIST samples generated by the best generative models on this dataset. Despite being a simple dataset that would pose no challenges to image models, it is hard for tabular generative models due to the high dimensionality and complex structure of correlations between features. We find NRGBoost to be the only tabular model that is able to generate passable samples.
  • Figure 5: Downsampled MNIST samples generated by Gibbs sampling from an NRGBoost model. Each row corresponds to an independent chain initialized with a sample from the initial model $f_0$ (first column). Each column represents a consecutive sample from the chain.
  • ...and 1 more figure
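
The four steps of the training iteration in the Figure 1 caption can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: it assumes a one-dimensional discrete domain with `K` states, a hypothetical data distribution `p_true`, a depth-one tree (stump) as the weak learner, leaf values P/Q − 1 obtained from an assumed quadratic objective E_p[δf] − E_q[δf + ½ δf²], and exact sampling from the model in place of the Gibbs sampling used in the paper. All names (`fit_stump`, `model_probs`, `POOL`, `LR`) are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
K, POOL, LR, ROUNDS = 8, 2000, 0.2, 100            # toy sizes (assumed)
p_true = np.array([1, 1, 2, 6, 6, 2, 1, 1]) / 20   # hypothetical data distribution
data = rng.choice(K, size=5000, p=p_true)          # "training set"
states = np.arange(K)


def model_probs(f):
    """Normalized density q_f(x) = exp(f(x)) / Z[f] on the toy domain."""
    q = np.exp(f - f.max())
    return q / q.sum()


def fit_stump(data, pool, floor=1e-3):
    """Step 2: choose the split with the largest quadratic-objective gain.

    P and Q are the data mass and model-sample mass in a leaf; the leaf
    value P/Q - 1 maximizes the assumed objective, with gain (P-Q)^2 / (2Q).
    """
    best_gain, best_delta = -np.inf, np.zeros(K)
    for t in range(1, K):
        delta, gain = np.zeros(K), 0.0
        for leaf in (states < t, states >= t):
            P = leaf[data].mean()
            Q = max(leaf[pool].mean(), floor)      # floor avoids division by zero
            delta[leaf] = P / Q - 1.0
            gain += (P - Q) ** 2 / (2.0 * Q)
        if gain > best_gain:
            best_gain, best_delta = gain, delta
    return best_delta


f = np.zeros(K)                                    # uniform initial model
pool = np.empty(0, dtype=int)                      # sample pool, initially empty
for _ in range(ROUNDS):
    # 1) Top up the pool with samples from the current model
    #    (exact sampling here; the paper uses Gibbs sampling).
    new = rng.choice(K, size=POOL - len(pool), p=model_probs(f))
    pool = np.concatenate([pool, new])
    # 2) Fit a weak learner representing the update delta_f.
    delta = fit_stump(data, pool)
    # 3) Update the model with a shrinkage factor.
    f += LR * delta
    # 4) Rejection sampling: keep x with probability
    #    exp(LR * (delta_f(x) - max delta_f)), so that samples from the old
    #    model that are kept are distributed according to the new one.
    keep = rng.random(len(pool)) < np.exp(LR * (delta[pool] - delta.max()))
    pool = pool[keep]

p_hat = np.bincount(data, minlength=K) / len(data)
print("total variation to data:", 0.5 * np.abs(p_hat - model_probs(f)).sum())
```

The rejection step is what makes the sampling amortized in this sketch: only the rejected fraction of the pool has to be redrawn in the next round, instead of the whole pool. In a realistic setting the domain is high-dimensional, so the exact `rng.choice` call would have to be replaced by an approximate sampler.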