MorphBoost: Self-Organizing Universal Gradient Boosting with Adaptive Tree Morphing

Boris Kriuk

TL;DR

MorphBoost tackles the rigidity of static tree structures in gradient boosting by introducing dynamic, self-organizing tree morphing guided by gradient statistics and information-theoretic criteria. The method combines automatic problem fingerprinting, morphing split functions, vectorized batch prediction, and interaction-aware feature importance to adapt to dataset complexity and problem type (binary, multiclass, regression). On 10 diverse datasets, MorphBoost achieves a 0.84% average accuracy gain over XGBoost, with the lowest variance and robust performance across difficulty levels, especially on high-dimensional problems. The results demonstrate practical benefits in both predictive performance and computational efficiency, with potential impact for scalable, adaptive ensembles in heterogeneous data environments.
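To make the idea of automatic problem fingerprinting concrete, the following is a minimal sketch of how a task type (binary, multiclass, regression) might be inferred from the target vector alone. The function name `fingerprint_problem`, the `max_classes` threshold, and the heuristic itself are assumptions made for illustration; this summary does not specify how MorphBoost performs the detection.

```python
import numpy as np


def fingerprint_problem(y, max_classes=20):
    """Guess the task type from the targets (hypothetical heuristic, not the paper's)."""
    y = np.asarray(y)
    n_unique = np.unique(y).size

    # Floating-point targets with any non-integer value are treated as regression.
    if y.dtype.kind == "f" and not np.all(np.mod(y, 1) == 0):
        return "regression"
    if n_unique == 2:
        return "binary"
    # max_classes is an assumed cut-off between "many classes" and "continuous".
    if n_unique <= max_classes:
        return "multiclass"
    return "regression"


if __name__ == "__main__":
    print(fingerprint_problem([0, 1, 1, 0]))
    print(fingerprint_problem([0, 1, 2, 1]))
    print(fingerprint_problem([0.5, 1.2, 3.3]))
```

A detected task type could then be used to select a loss function and default hyperparameters, which is the role the summary assigns to fingerprinting.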

Abstract

Traditional gradient boosting algorithms employ static tree structures with fixed splitting criteria that remain unchanged throughout training, limiting their ability to adapt to evolving gradient distributions and problem-specific characteristics across different learning stages. This work introduces MorphBoost, a new gradient boosting framework featuring self-organizing tree structures that dynamically morph their splitting behavior during training. The algorithm implements adaptive split functions that evolve based on accumulated gradient statistics and iteration-dependent learning pressures, enabling automatic adjustment to problem complexity. Key innovations include: (1) a morphing split criterion combining gradient-based scores with information-theoretic metrics weighted by training progress; (2) automatic problem fingerprinting for intelligent parameter configuration across binary/multiclass/regression tasks; (3) vectorized tree prediction achieving significant computational speedups; (4) interaction-aware feature importance detecting multiplicative relationships; and (5) fast-mode optimization balancing speed and accuracy. Comprehensive benchmarking across 10 diverse datasets against competitive models (XGBoost, LightGBM, GradientBoosting, HistGradientBoosting, ensemble methods) demonstrates that MorphBoost achieves state-of-the-art performance, outperforming XGBoost by 0.84% on average. MorphBoost secured the overall winner position with 4/10 dataset wins (40% win rate) and 6/30 top-3 finishes (20%), while maintaining the lowest variance (σ=0.0948) and the highest minimum accuracy across all models, indicating superior consistency and robustness. Performance analysis across difficulty levels shows competitive results on easy datasets and notable improvements on advanced problems, due to higher adaptation levels.
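One way to picture a split criterion that combines a gradient-based score with an information-theoretic metric weighted by training progress is the sketch below. Everything here is an assumption for illustration: the linear blend, the direction of the weighting (gradient gain early, information gain late), and all function names are hypothetical, and the paper's own equations are not reproduced in this summary. The gradient term uses the standard second-order gain known from XGBoost-style boosting, without the complexity penalty.

```python
import numpy as np


def gradient_gain(g, h, left_mask, reg_lambda=1.0):
    """Second-order split gain: 0.5 * (score_left + score_right - score_parent)."""
    def leaf_score(gs, hs):
        return gs.sum() ** 2 / (hs.sum() + reg_lambda)

    return 0.5 * (
        leaf_score(g[left_mask], h[left_mask])
        + leaf_score(g[~left_mask], h[~left_mask])
        - leaf_score(g, h)
    )


def entropy(labels):
    """Shannon entropy (in bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def information_gain(y, left_mask):
    """Entropy reduction achieved by splitting y with left_mask."""
    n = y.size
    n_left = int(left_mask.sum())
    if n_left == 0 or n_left == n:
        return 0.0  # degenerate split: nothing is separated
    n_right = n - n_left
    return (
        entropy(y)
        - (n_left / n) * entropy(y[left_mask])
        - (n_right / n) * entropy(y[~left_mask])
    )


def morphing_split_score(g, h, y, left_mask, iteration, n_iterations):
    """Blend the two scores with a weight that moves with training progress.

    ASSUMPTION: a linear schedule from pure gradient gain (first iteration)
    to pure information gain (last iteration); the paper may weight differently.
    """
    progress = iteration / max(n_iterations - 1, 1)
    return (1.0 - progress) * gradient_gain(g, h, left_mask) + progress * information_gain(y, left_mask)


if __name__ == "__main__":
    y = np.array([0, 0, 1, 1])
    p = np.full(4, 0.5)               # initial predicted probabilities
    g, h = p - y, p * (1.0 - p)       # log-loss gradients and Hessians
    mask = np.array([True, True, False, False])
    for t in (0, 5, 9):
        print(t, morphing_split_score(g, h, y, mask, t, 10))
```

Under this assumed schedule the same candidate split is scored differently at different boosting rounds, which is the sense in which the criterion changes during training rather than staying fixed.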

Paper Structure

This paper contains 16 sections, 12 equations, 5 figures.

Figures (5)

  • Figure 1: Overview of MorphBoost Architecture.
  • Figure 2: Accuracy distribution across all 10 benchmark datasets.
  • Figure 3: Model-wise performance consistency analysis.
  • Figure 4: Accuracy performance radar of Top 5 models for 10 datasets.
  • Figure 5: Accuracy heatmap showing performance of all 10 models across 10 datasets.