NRGBoost: Energy-Based Generative Boosted Trees
João Bravo
TL;DR
NRGBoost is an energy-based generative boosting algorithm for tabular data. It learns a density q_f(x) ∝ exp(f(x)), with normalization constant Z[f], by optimizing a quadratic approximation to the log-likelihood at each boosting step. It uses piecewise-constant trees as weak learners and introduces amortized Gibbs sampling to manage the computational cost, including a rejection-sampling scheme that reuses samples across boosting rounds. Empirically, NRGBoost achieves discriminative performance close to that of gradient-boosted decision trees on real datasets and produces samples competitive with those of neural generative models, while enabling principled marginalization over missing data and flexible conditional sampling. The work shows that tree-based density modeling with energy-based boosting is viable, and it discusses the practical trade-offs in training cost, sampling efficiency, and applicability to density-based inference tasks on tabular data.
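To make the phrase "quadratic approximation to the log-likelihood" concrete, the following is a minimal numerical sketch, not the code from the NRGBoost repository. It assumes a toy discrete domain with 8 states, small enough to compute Z[f] exactly, and an arbitrary small update `delta` standing in for a weak learner. The function names `log_likelihood` and `quadratic_gain` are hypothetical. The sketch relies on the standard second-order expansion of log Z, under which the change in log-likelihood from adding `delta` to f is approximately E_p[delta] - E_q[delta] - 0.5 * Var_q[delta], where p is the data distribution and q = q_f is the current model.

```python
import numpy as np

def log_likelihood(f, p):
    """Exact log-likelihood E_p[f] - log Z[f] on a small discrete domain."""
    m = np.max(f)
    log_z = m + np.log(np.sum(np.exp(f - m)))  # stable log-sum-exp
    return float(np.dot(p, f) - log_z)

def quadratic_gain(f, delta, p):
    """Second-order approximation of L[f + delta] - L[f].

    Uses E_p[delta] - E_q[delta] - 0.5 * Var_q[delta], with q proportional to exp(f).
    """
    q = np.exp(f - np.max(f))
    q /= q.sum()
    mean_q = np.dot(q, delta)
    var_q = np.dot(q, (delta - mean_q) ** 2)
    return float(np.dot(p, delta) - mean_q - 0.5 * var_q)

# Toy setup (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n_states = 8
p = rng.dirichlet(np.ones(n_states))          # stand-in for the data distribution
f = np.zeros(n_states)                        # current energy: uniform model
delta = 0.05 * rng.standard_normal(n_states)  # small candidate update

exact = log_likelihood(f + delta, p) - log_likelihood(f, p)
approx = quadratic_gain(f, delta, p)
print(f"exact gain: {exact:.6f}, quadratic approximation: {approx:.6f}")
```

For small updates the two quantities agree closely, which is what makes the quadratic objective a reasonable per-round surrogate. In a realistic setting the expectations under q cannot be computed by enumeration and must be estimated from model samples, which is where the sampling cost mentioned above arises.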
Abstract
Despite the rise to dominance of deep learning in unstructured data domains, tree-based methods such as Random Forests (RF) and Gradient Boosted Decision Trees (GBDT) are still the workhorses for handling discriminative tasks on tabular data. We explore generative extensions of these popular algorithms with a focus on explicitly modeling the data density (up to a normalization constant), thus enabling other applications besides sampling. As our main contribution, we propose an energy-based generative boosting algorithm that is analogous to the second-order boosting implemented in popular libraries like XGBoost. We show that, despite producing a generative model capable of handling inference tasks over any input variable, our proposed algorithm can achieve similar discriminative performance to GBDT on a number of real-world tabular datasets, outperforming alternative generative approaches. At the same time, we show that it is also competitive with neural-network-based models for sampling. Code is available at https://github.com/ajoo/nrgboost.
