Conformal Prediction for Hierarchical Data
Guillaume Principato, Gilles Stoltz, Yvenn Amara-Ouali, Yannig Goude, Bachir Hamrouche, Jean-Michel Poggi
TL;DR
This paper develops a framework that combines conformal prediction with forecast reconciliation to construct valid, efficient prediction regions for hierarchical multivariate data. It introduces joint-coverage ellipsoidal SCP and a hierarchical, component-wise SCP with a projection-based reconciliation that enforces coherence across levels. The authors prove that reconciliation can yield smaller prediction regions than non-reconciled methods under both joint and component-wise guarantees, and they provide oracle and practical implementations (including minimum-trace projections) with efficiency guarantees. Empirical results on synthetic hierarchies confirm substantial efficiency gains, particularly for weighted component-wise objectives, and highlight robustness considerations for large-scale hierarchies. The work lays a theoretical and practical foundation for hierarchy-aware, distribution-free uncertainty quantification in forecasting settings, with potential extensions to hierarchical time series and adaptive conformal inference.
Abstract
We consider conformal prediction for multivariate data and focus on hierarchical data, where some components are linear combinations of others. Intuitively, the hierarchical structure can be leveraged to reduce the size of prediction regions at the same coverage level. We implement this intuition by adding a projection step (also called a reconciliation step) to the split conformal prediction (SCP) procedure, and prove that the resulting prediction regions are indeed globally smaller. We do so both under the classic objective of joint coverage and under a new and more challenging objective, component-wise coverage, for which efficiency results are harder to obtain. The associated strategies and their analyses draw on both the SCP literature and the forecast reconciliation literature, which we connect. We also illustrate the theoretical findings on simulated data, for hierarchies of different scales.
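
To make the projection step concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a hypothetical three-node hierarchy (total = b1 + b2), synthetic Gaussian data, absolute-residual scores computed per component, and the orthogonal (OLS) projection onto the coherent subspace; other projections, such as minimum-trace ones, are not shown. All names (S, P, conformal_quantile) are illustrative.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Split-conformal quantile: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return np.sort(scores)[k - 1]

# Hypothetical hierarchy, components ordered as (total, b1, b2).
# Coherent vectors are exactly those of the form S @ b.
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Assumption: orthogonal (OLS) projection onto the column space of S.
P = S @ np.linalg.inv(S.T @ S) @ S.T

rng = np.random.default_rng(0)
n_cal, alpha = 500, 0.1

# Synthetic calibration set: coherent observations, incoherent base forecasts.
bottom = rng.normal(size=(n_cal, 2))
y_cal = bottom @ S.T
y_hat = y_cal + rng.normal(scale=1.0, size=y_cal.shape)

# Reconciliation step: project every base forecast onto the coherent subspace.
y_rec = y_hat @ P.T

# Component-wise SCP: one quantile of absolute residuals per component.
q_base = np.array([conformal_quantile(np.abs(y_cal[:, i] - y_hat[:, i]), alpha)
                   for i in range(3)])
q_rec = np.array([conformal_quantile(np.abs(y_cal[:, i] - y_rec[:, i]), alpha)
                  for i in range(3)])

# The interval for component i is [forecast_i - q_i, forecast_i + q_i].
print("half-widths without reconciliation:", q_base)
print("half-widths with reconciliation:   ", q_rec)
```

In this toy setup the reconciled half-widths tend to be smaller, because the projection removes the part of the forecast error that is orthogonal to the coherent subspace while leaving the coherent observations unchanged. Since P is fixed in advance rather than estimated from the calibration data, the scores stay exchangeable and the usual SCP coverage argument applies to each component.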
