Degrees-of-freedom penalized piecewise regression
Stefan Volz, Martin Storath, Andreas Weinmann
TL;DR
The paper introduces degrees-of-freedom penalized piecewise regression (DofPPR), a framework that penalizes the sum of per-segment degrees of freedom rather than the number of segments, so that segments may use models of different complexity (e.g., polynomials of different degrees). In the least-squares setting, it establishes almost-sure uniqueness of the discrete minimizer (excluding interpolating parts) and develops a fast algorithm that computes the entire regularization path and selects the hyperparameter exactly via rolling cross-validation with the one-standard-error rule. The authors provide a complete implementation (Rust core with Python bindings) and report improved performance on simulated data and on the real-data TCPD changepoint benchmark, where a constrained variant of the method achieves state-of-the-art results. The approach accepts optional user parameters that encode domain knowledge and yields an interpretable, automatable model selection workflow suited to exploratory data analysis and changepoint detection. The work contributes to the changepoint and piecewise regression literature by allowing data-adaptive model complexity on each segment while keeping hyperparameter tuning exact.
Abstract
Many popular piecewise regression models rely on minimizing a cost function on the model fit with a linear penalty on the number of segments. However, this penalty does not take into account the varying complexities of the model functions on the segments, which can lead to overfitting when models of varying complexity, such as polynomials of different degrees, are used. In this work, we improve on this approach by instead penalizing the sum of the degrees of freedom over all segments; we call the resulting method degrees-of-freedom penalized piecewise regression (DofPPR). We show that the solutions of the resulting minimization problem are unique for almost all input data in a least squares setting. We develop a fast algorithm that not only computes a minimizer but also determines an optimal hyperparameter exactly, where optimality is understood in the sense of rolling cross validation with the one standard error rule. This eliminates manual hyperparameter selection. Our method supports optional user parameters for incorporating domain knowledge. We provide open-source Python/Rust code for the piecewise polynomial least squares case, which can be extended to further models. We demonstrate the practical utility through a simulation study and by applications to real data. A constrained variant of the proposed method gives state-of-the-art results in the Turing benchmark for unsupervised changepoint detection.
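
To make the penalty concrete, the following is a minimal sketch of a degrees-of-freedom penalized piecewise polynomial fit for a single fixed hyperparameter. It is not the authors' algorithm or their Python/Rust package: it is a naive dynamic program that refits every candidate segment from scratch, and it computes neither the regularization path nor the cross-validated hyperparameter. The names `segment_cost`, `dof_penalized_fit`, `gamma`, and `max_dof` are assumptions introduced here for illustration, as is the convention that a polynomial with `dof` coefficients counts `dof` degrees of freedom and that the objective is the sum over segments of the residual sum of squares plus `gamma * dof`.

```python
import numpy as np


def segment_cost(x, y, dof):
    """Residual sum of squares of the least-squares polynomial with `dof` coefficients."""
    A = np.vander(x, dof, increasing=True)  # columns 1, x, ..., x**(dof - 1)
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    r = y - A @ coef
    return float(r @ r)


def dof_penalized_fit(x, y, gamma, max_dof=3):
    """Minimize sum over segments of (RSS + gamma * dof) by dynamic programming.

    Returns a list of (start, stop, dof) triples with half-open index ranges
    and the optimal objective value. Hypothetical helper, for illustration only.
    """
    n = len(x)
    best = np.full(n + 1, np.inf)  # best[r]: optimal objective on the first r points
    best[0] = 0.0
    back = [None] * (n + 1)        # back[r]: (start, dof) of the last segment
    for r in range(1, n + 1):
        for l in range(r):
            # A segment with m points may use at most m degrees of freedom.
            for dof in range(1, min(max_dof, r - l) + 1):
                cost = best[l] + gamma * dof + segment_cost(x[l:r], y[l:r], dof)
                if cost < best[r]:
                    best[r] = cost
                    back[r] = (l, dof)
    # Walk the back-pointers from the right end to recover the segmentation.
    segments = []
    r = n
    while r > 0:
        l, dof = back[r]
        segments.append((l, r, dof))
        r = l
    return segments[::-1], float(best[n])


# Noise-free toy signal: 20 constant samples followed by 20 samples on a line.
x = np.linspace(0.0, 1.0, 40)
y = np.where(x < 0.5, 1.0, 4.0 * x)
segments, objective = dof_penalized_fit(x, y, gamma=0.1)
print(segments, objective)
```

On this toy input the sketch should select a constant model on the first segment and a linear model on the second, so the flat part is charged one degree of freedom rather than the two a segment-count penalty with a fixed linear model would implicitly spend. The cubic number of least-squares solves makes it suitable only for short signals.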
