Multivariate Forecasting of Bitcoin Volatility with Gradient Boosting: Deterministic, Probabilistic, and Feature Importance Perspectives
Grzegorz Dudek, Mateusz Kasprzyk, Paweł Pełka
TL;DR
This paper develops a framework for forecasting Bitcoin realized volatility (RV) with LightGBM (LGBM), covering both deterministic and probabilistic forecasts. It introduces two quantile methods, direct pinball-loss regression and residual simulation (QRS), and uses 69 predictors plus shock indicators to analyze volatility drivers. Empirically, LGBM-based forecasts outperform econometric and baseline ML approaches, with QRS-LGBM delivering the best probabilistic calibration and sharp prediction intervals; feature importance consistently points to lagged RV, trading volume, Google Trends, and market capitalization as key drivers. The work discusses practical implications for risk management and trading, highlights computational efficiency and robustness, and identifies avenues for future work such as SHAP-based attribution and alternative probabilistic frameworks.
Abstract
This study investigates the application of the Light Gradient Boosting Machine (LGBM) model for both deterministic and probabilistic forecasting of Bitcoin realized volatility. Utilizing a comprehensive set of 69 predictors (encompassing market, behavioral, and macroeconomic indicators), we evaluate the performance of LGBM-based models and compare them with both econometric and machine learning baselines. For probabilistic forecasting, we explore two quantile-based approaches: direct quantile regression using the pinball loss function, and a residual simulation method that transforms point forecasts into predictive distributions. To identify the main drivers of volatility, we employ gain-based and permutation feature importance techniques, consistently highlighting the significance of trading volume, lagged volatility measures, investor attention, and market capitalization. The results demonstrate that LGBM models effectively capture the nonlinear and high-variance characteristics of cryptocurrency markets while providing interpretable insights into the underlying volatility dynamics.
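The two quantile-based approaches named in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and does not reproduce the paper's implementation. The function names (`pinball_loss`, `empirical_quantile`, `residual_simulation_quantiles`), the bootstrap resampling of historical residuals, and the number of draws are all assumptions made for illustration; the paper's residual simulation procedure may differ in how residuals are modeled and sampled.

```python
import random


def pinball_loss(y_true, y_pred, q):
    """Average pinball (quantile) loss at level q, with 0 < q < 1.

    Under-predictions are weighted by q and over-predictions by (1 - q),
    so minimizing this loss targets the q-th conditional quantile.
    """
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def empirical_quantile(values, q):
    """Empirical quantile of a list using linear interpolation."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def residual_simulation_quantiles(point_forecast, residuals, levels,
                                  n_draws=2000, seed=0):
    """Turn a point forecast into predictive quantiles.

    Assumption: historical residuals (actual minus forecast) are resampled
    with replacement and added to the point forecast; the quantiles of the
    simulated values form the predictive distribution.
    """
    rng = random.Random(seed)
    draws = [point_forecast + rng.choice(residuals) for _ in range(n_draws)]
    return {q: empirical_quantile(draws, q) for q in levels}


# Hypothetical numbers, for illustration only (not data from the paper).
past_residuals = [-0.2, -0.1, 0.0, 0.1, 0.2]
quantiles = residual_simulation_quantiles(1.0, past_residuals, [0.05, 0.5, 0.95])
loss_at_90 = pinball_loss([1.0, 2.0], [0.0, 3.0], 0.9)
```

In the direct approach, a separate model would be trained per quantile level with the pinball loss as its objective. In the residual simulation approach, a single point-forecast model is trained and the spread comes entirely from its past errors, which is why the resulting intervals inherit the quality of the point forecast.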
