Gradient Boosting Neural Networks: GrowNet

Sarkhan Badirli, Xuanqing Liu, Zhengming Xing, Avradeep Bhowmik, Khoa Doan, Sathiya S. Keerthi

TL;DR

GrowNet reframes gradient boosting by using shallow neural networks as weak learners to incrementally build deep models. It introduces a global corrective step and second-order gradient statistics to mitigate greedy fitting and improve stability across classification, regression, and ranking. Empirical results on multiple datasets show GrowNet outperforms state-of-the-art boosting methods like XGBoost and AdaNet, while offering faster training and simpler tuning than large DNNs. The work also demonstrates the value of stacked features and ablation analyses in guiding model design for diverse ML tasks.

Abstract

A novel gradient boosting framework is proposed in which shallow neural networks are employed as "weak learners". General loss functions are considered under this unified framework, with specific examples presented for classification, regression, and learning to rank. A fully corrective step is incorporated to remedy the pitfall of greedy function approximation in classic gradient boosting decision trees. The proposed model outperforms state-of-the-art boosting methods in all three tasks on multiple datasets. An ablation study is performed to shed light on the effect of each model component and of the model hyperparameters.

Paper Structure

This paper contains 24 sections, 8 equations, 7 figures, 8 tables, 1 algorithm.

Figures (7)

  • Figure 1: GrowNet architecture. After the first weak learner, each predictor is trained on the original input combined with the penultimate-layer features of the previous weak learner. The final output is the weighted sum of the outputs of all predictors, $\sum_{k=1}^{K}\alpha_k f_k(x)$. Here, "Model K" denotes weak learner K.
  • Figure 2: Classification training losses
  • Figure 3: Boosting rate evolution
  • Figure 4: Effect of the number of neurons on classification performance
  • Figure 5: Training loss visualization for the learning to rank task on MSLR dataset. We used pairwise loss.
  • ...and 2 more figures
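
The architecture described in the Figure 1 caption can be sketched as a forward pass. This is a minimal illustration under stated assumptions, not the authors' implementation: each weak learner is taken to be a one-hidden-layer network with ReLU activation and a scalar output, and the names `init_weak_learner`, `weak_learner_forward`, and `grownet_forward` are hypothetical. It shows only how the stacked features and the weighted sum $\sum_{k=1}^{K}\alpha_k f_k(x)$ fit together; training, the corrective step, and the second-order statistics are left out.

```python
import random


def init_weak_learner(rng, in_dim, hidden_dim):
    """Hypothetical shallow network: one hidden layer, scalar output."""
    return {
        "W1": [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
               for _ in range(hidden_dim)],
        "b1": [0.0] * hidden_dim,
        "w2": [rng.gauss(0.0, 0.1) for _ in range(hidden_dim)],
        "b2": 0.0,
    }


def weak_learner_forward(params, inputs):
    """Return (penultimate-layer features, scalar output) for one example."""
    if len(inputs) != len(params["W1"][0]):
        raise ValueError("input size does not match this weak learner")
    # ReLU hidden layer (the activation is an assumption of this sketch).
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(params["W1"], params["b1"])
    ]
    output = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return hidden, output


def grownet_forward(learners, alphas, x):
    """Weighted sum of weak-learner outputs, sum_k alpha_k * f_k(x)."""
    prediction = 0.0
    prev_hidden = []  # the first learner sees only the original input
    for params, alpha in zip(learners, alphas):
        # Later learners see the original input plus the previous
        # learner's penultimate-layer features.
        inputs = list(x) + prev_hidden
        prev_hidden, output = weak_learner_forward(params, inputs)
        prediction += alpha * output
    return prediction


rng = random.Random(0)
n_features, hidden_dim, n_learners = 5, 8, 3
learners = [
    init_weak_learner(
        rng, n_features if k == 0 else n_features + hidden_dim, hidden_dim
    )
    for k in range(n_learners)
]
alphas = [1.0, 0.5, 0.25]  # illustrative boosting rates, not learned here
x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
pred = grownet_forward(learners, alphas, x)
```

In this sketch, every learner after the first takes an input of size `n_features + hidden_dim`, which is what lets a sequence of shallow networks pass representation forward while the final prediction stays a simple additive ensemble.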