Projected random forests and conformal prediction of circular data
Paulo C. Marques F., Rinaldo Artes, Helton Graziadei
TL;DR
The paper develops conformal prediction for regression with circular responses under exchangeable data. It introduces a circular conformity score based on angular distance and a projection method that turns linear-response models into circular predictors. A key contribution is applying this projection to random forests and using out-of-bag conformal prediction, which avoids setting aside a separate calibration sample. On synthetic and wind-direction datasets, projected random forests produce shorter prediction arcs (higher efficiency) than split conformal sets from a projected normal linear model and a circular forest, with empirical coverage close to the nominal level. Open-source software is provided to reproduce the analyses and results.
Abstract
We apply split conformal prediction techniques to regression problems with circular responses by introducing a suitable conformity score, leading to prediction sets with adaptive arc length and finite-sample coverage guarantees for any circular predictive model under exchangeable data. Leveraging the high performance of existing predictive models designed for linear responses, we analyze a general projection procedure that converts any linear response regression model into one suitable for circular responses. When random forests serve as basis models in this projection procedure, we use their out-of-bag predictions to eliminate the need for a separate calibration sample in the construction of prediction sets. For synthetic and real datasets, the resulting projected random forests model produces more efficient out-of-bag conformal prediction sets, with shorter median arc length, than the split conformal prediction sets generated by two existing alternative models.
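To make the ingredients of the abstract concrete, the following is a minimal sketch of split conformal prediction on the circle. It is not the authors' implementation. It assumes the simplest possible score, the plain angular distance between the observed and predicted angle, which yields arcs of constant width rather than the adaptive arc length described above. It also assumes that the projection step amounts to predicting the cosine and sine of the response with two linear-response models and mapping the pair back to an angle with `atan2`. All function names (`angular_distance`, `project_to_angle`, `conformal_radius`, `prediction_arc`) are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def angular_distance(a, b):
    """Shortest arc length between two angles (radians); result lies in [0, pi]."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)


def project_to_angle(cos_hat, sin_hat):
    """Assumed projection step: map predicted (cos, sin) components,
    e.g. from two linear-response models, to an angle in [0, 2*pi)."""
    return math.atan2(sin_hat, cos_hat) % TWO_PI


def conformal_radius(cal_true, cal_pred, alpha):
    """Split conformal half-width from a calibration sample.

    Uses the angular distance as a nonconformity score and returns the
    ceil((1 - alpha) * (n + 1))-th smallest score. If that rank exceeds n,
    the whole circle is returned (half-width pi).
    """
    scores = sorted(angular_distance(y, p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)
    k = math.ceil((1.0 - alpha) * (n + 1))
    if k > n:
        return math.pi
    return scores[k - 1]


def prediction_arc(pred, radius):
    """Arc centred at pred with half-width radius, as (start, end) angles
    traversed counterclockwise; a radius of pi means the full circle."""
    return ((pred - radius) % TWO_PI, (pred + radius) % TWO_PI)


# Toy calibration data (radians); values are illustrative only.
cal_true = [0.0, 1.0, 2.0, 3.0]
cal_pred = [0.1, 1.2, 2.3, 3.4]
radius = conformal_radius(cal_true, cal_pred, alpha=0.5)
arc = prediction_arc(project_to_angle(1.0, 0.1), radius)
```

In an out-of-bag variant, `cal_pred` would be replaced by each training point's out-of-bag forest prediction, so no separate calibration sample is held out. That substitution is only indicated here, not implemented.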
