Gradient Boosting Neural Networks: GrowNet
Sarkhan Badirli, Xuanqing Liu, Zhengming Xing, Avradeep Bhowmik, Khoa Doan, Sathiya S. Keerthi
TL;DR
GrowNet reframes gradient boosting by using shallow neural networks as weak learners to incrementally build deep models. It introduces a global corrective step and second-order gradient statistics to mitigate greedy fitting and improve stability across classification, regression, and ranking. Empirical results on multiple datasets show GrowNet outperforms state-of-the-art boosting methods like XGBoost and AdaNet, while offering faster training and simpler tuning than large DNNs. The work also demonstrates the value of stacked features and ablation analyses in guiding model design for diverse ML tasks.
Abstract
A novel gradient boosting framework is proposed in which shallow neural networks are employed as "weak learners". General loss functions are considered under this unified framework, with specific examples presented for classification, regression, and learning to rank. A fully corrective step is incorporated to remedy the pitfall of greedy function approximation in classic gradient boosting decision trees. The proposed model outperforms state-of-the-art boosting methods in all three tasks on multiple datasets. An ablation study is performed to shed light on the effect of each model component and of the model hyperparameters.
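
To make the two ingredients named in the abstract concrete, the following is a minimal sketch of boosting with shallow networks plus a corrective step. It is not the authors' implementation. Everything in it is an assumption made for illustration: a scalar-input toy regression problem, squared-error loss (for which the second-order target reduces to the plain residual), a fixed boost rate, pure-Python gradient descent, and hypothetical names such as `new_learner` and `gradient_step`. Feature stacking between learners is omitted.

```python
import math
import random

random.seed(0)

HIDDEN = 4        # hidden units per weak learner (illustrative choice)
BOOST_RATE = 1.0  # kept fixed in this sketch for simplicity


def new_learner():
    """One-hidden-layer tanh network on a scalar input; output layer starts at zero."""
    return {
        "w1": [random.uniform(-2.0, 2.0) for _ in range(HIDDEN)],
        "b1": [random.uniform(-1.0, 1.0) for _ in range(HIDDEN)],
        "w2": [0.0] * HIDDEN,
        "b2": 0.0,
    }


def forward(net, x):
    hidden = [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]
    out = net["b2"] + sum(v * h for v, h in zip(net["w2"], hidden))
    return out, hidden


def gradient_step(net, xs, upstream, lr):
    """One full-batch descent step, given dLoss/dOutput for every sample."""
    n = len(xs)
    g_w1, g_b1, g_w2, g_b2 = [0.0] * HIDDEN, [0.0] * HIDDEN, [0.0] * HIDDEN, 0.0
    for x, d_out in zip(xs, upstream):
        _, hidden = forward(net, x)
        g_b2 += d_out
        for j in range(HIDDEN):
            g_w2[j] += d_out * hidden[j]
            d_pre = d_out * net["w2"][j] * (1.0 - hidden[j] ** 2)
            g_w1[j] += d_pre * x
            g_b1[j] += d_pre
    for j in range(HIDDEN):
        net["w1"][j] -= lr * g_w1[j] / n
        net["b1"][j] -= lr * g_b1[j] / n
        net["w2"][j] -= lr * g_w2[j] / n
    net["b2"] -= lr * g_b2 / n


def predict(f0, learners, x):
    return f0 + BOOST_RATE * sum(forward(net, x)[0] for net in learners)


def mse(f0, learners, xs, ys):
    return sum((predict(f0, learners, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy regression data (assumed for illustration only)
xs = [-1.0 + 2.0 * i / 39 for i in range(40)]
ys = [math.sin(3.0 * x) for x in xs]

f0 = sum(ys) / len(ys)  # constant initial model
learners = []
baseline_mse = mse(f0, learners, xs, ys)

for stage in range(3):
    # Greedy step: fit a fresh weak learner to the current residuals
    residuals = [y - predict(f0, learners, x) for x, y in zip(xs, ys)]
    net = new_learner()
    for _ in range(300):
        upstream = [forward(net, x)[0] - r for x, r in zip(xs, residuals)]
        gradient_step(net, xs, upstream, lr=0.05)
    learners.append(net)

    # Corrective step: update all learners jointly on the original loss
    for _ in range(50):
        errors = [BOOST_RATE * (predict(f0, learners, x) - y) for x, y in zip(xs, ys)]
        for member in learners:
            gradient_step(member, xs, errors, lr=0.02)

final_mse = mse(f0, learners, xs, ys)
print(f"baseline MSE: {baseline_mse:.4f}, boosted MSE: {final_mse:.4f}")
```

The greedy step alone never revisits earlier learners once they are added. The joint update in the corrective loop is what allows previously fitted networks to adjust to the learners added after them, which is the remedy for greedy function approximation that the abstract refers to.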
