On Quantile Regression Forests for Modelling Mixed-Frequency and Longitudinal Data

Mila Andreani

TL;DR

The paper models the entire conditional distribution in datasets with mixed-frequency and longitudinal structure by introducing two novel quantile regression frameworks: MIDAS-QRF and FM-QRF. MIDAS-QRF embeds the MIDAS approach into Quantile Regression Forests to incorporate low-frequency predictors non-parametrically, while Dynamic MIDAS-QRF adds autoregressive quantile dynamics to better capture time-varying tails. FM-QRF extends Quantile Regression Forests to longitudinal data through a finite-mixture random-effects structure estimated via an EM algorithm with nonparametric maximum likelihood, enabling flexible, non-linear quantile estimation without strict distributional assumptions. Empirical applications show competitive tail-risk forecasting in finance (VaR for energy commodities), climate economics (GDP growth-at-risk across countries), and public health (UK children's SDQ scores); both methods reveal substantial tail heterogeneity and the value of non-parametric approaches over traditional linear QR models. The methodological contributions thus provide robust, interpretable tools for tail-risk assessment across diverse complex data settings, with practical implications for risk management and policy analysis.

Abstract

The aim of this thesis is to extend the applications of the Quantile Regression Forest (QRF) algorithm to handle mixed-frequency and longitudinal data. To this end, standard statistical approaches have been exploited to build two novel algorithms: the Mixed-Frequency Quantile Regression Forest (MIDAS-QRF) and the Finite Mixture Quantile Regression Forest (FM-QRF). The MIDAS-QRF combines the flexibility of QRF with the Mixed Data Sampling (MIDAS) approach, enabling non-parametric quantile estimation with variables observed at different frequencies. FM-QRF, on the other hand, extends random effects machine learning algorithms to a QR framework, allowing for conditional quantile estimation in a longitudinal data setting. The contributions of this dissertation are both methodological and empirical. Methodologically, the MIDAS-QRF and the FM-QRF represent two novel approaches for handling mixed-frequency and longitudinal data in a QR machine learning framework. Empirically, the application of the proposed models in financial risk management and climate-change impact evaluation demonstrates their validity as accurate and flexible models to be applied in complex empirical settings.
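
As a rough illustration of the two ingredients named in the abstract, the sketch below shows (i) how a MIDAS-style weighting can collapse several lags of a low-frequency variable into a single predictor and (ii) the pinball (check) loss commonly used to score a quantile forecast at level $\tau$. The Beta lag weighting scheme, the function names, and the parameter values are assumptions chosen for illustration and are not taken from the thesis; the forest itself is not implemented here.

```python
# Illustrative sketch only: names, defaults, and the Beta lag scheme are
# assumptions, not the thesis's implementation.

def beta_weights(n_lags, omega1=1.0, omega2=3.0):
    """Normalized Beta lag weights for lags k = 1..n_lags (requires n_lags >= 2)."""
    if n_lags < 2:
        raise ValueError("n_lags must be at least 2")
    raw = [
        (k / n_lags) ** (omega1 - 1.0) * (1.0 - k / n_lags) ** (omega2 - 1.0)
        for k in range(1, n_lags + 1)
    ]
    total = sum(raw)
    return [r / total for r in raw]


def midas_aggregate(low_freq_lags, weights):
    """Weighted sum of lagged low-frequency observations (most recent first)."""
    if len(low_freq_lags) != len(weights):
        raise ValueError("need one weight per lag")
    return sum(w * x for w, x in zip(weights, low_freq_lags))


def pinball_loss(y, q, tau):
    """Check loss of a quantile forecast q for outcome y at level tau."""
    u = y - q
    return u * (tau - (1.0 if u < 0 else 0.0))


# Hypothetical data: the last four monthly readings of a macro variable.
monthly_lags = [2.0, 1.5, 1.0, 0.5]
weights = beta_weights(len(monthly_lags))
feature = midas_aggregate(monthly_lags, weights)  # one predictor for the forest
```

With `omega1 = 1` and `omega2 > 1` the weights decay with the lag, so recent low-frequency observations dominate the aggregated predictor. The resulting feature could then be passed, alongside the high-frequency covariates, to any quantile regression forest implementation.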

Paper Structure

This paper contains 35 sections, 34 equations, 24 figures, 19 tables, 1 algorithm.

Figures (24)

  • Figure 1: Out-of-sample predictions for the Heating Oil index (black line) at quantile levels $\tau = 0.01, 0.025, 0.05$. The top and bottom panels show the predictions obtained with the static and the dynamic MIDAS-QRF models, respectively.
  • Figure 2: Variable importance for the static MIDAS-QRF at $\tau=0.01$ for the Heating Oil index
  • Figure 3: Variable importance for the static MIDAS-QRF at $\tau=0.025$ for the Heating Oil index
  • Figure 4: Variable importance for the static MIDAS-QRF at $\tau=0.05$ for the Heating Oil index
  • Figure 5: Average variable importance across quantiles $I_j$ for each covariate.
  • ...and 19 more figures