Baseline Results for Selected Nonlinear System Identification Benchmarks
Max D. Champneys, Gerben I. Beintema, Roland Tóth, Maarten Schoukens, Timothy J. Rogers
TL;DR
The study addresses the need for objective baseline comparisons in nonlinear system identification (NLSI) by evaluating ten baseline methods, spanning LTI state-space, AR/ARX/NARX variants, GP-NARX, and recurrent networks, on five public benchmarks (Silverbox, EMPS, Wiener–Hammerstein, cascaded tanks, and CED) under simulation. Baselines are tested on unseen data using RMSE as the primary metric, with careful data handling (training/validation/test splits) and cross-validated lag or order settings; results are contrasted with selected state-of-the-art (SOTA) methods. Findings indicate that nonlinear baselines often outperform their linear counterparts, though SOTA methods can still dominate on certain benchmarks (notably pNARX and GP-NARX in some cases), while EMPS remains particularly challenging. The work emphasizes the necessity of reporting baselines to enable fair comparisons and proposes the baselines as a practical starting point for future NLSI developments, especially for balancing physical insight and data-driven learning.
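The evaluation protocol summarized above (fit on training data, choose the lag on validation data, report RMSE on unseen test data) can be illustrated with a minimal sketch. It assumes a plain linear ARX model fitted by least squares, a single hold-out validation split rather than any particular cross-validation scheme, and a free-run simulation that is initialized with the first measured outputs. The function names (`rmse`, `fit_arx`, `simulate_arx`, `select_lags`) are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def _regressor(y, u, k, n_lags):
    """Regressor [y[k-1..k-n], u[k-1..k-n]] (assumed ARX structure)."""
    return np.concatenate([y[k - n_lags:k][::-1], u[k - n_lags:k][::-1]])


def fit_arx(u, y, n_lags):
    """Least-squares ARX fit using one-step-ahead regressors."""
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    rows = np.array([_regressor(y, u, k, n_lags) for k in range(n_lags, len(y))])
    theta, *_ = np.linalg.lstsq(rows, y[n_lags:], rcond=None)
    return theta


def simulate_arx(theta, u, y_init, n_lags):
    """Free-run simulation: past outputs are the model's own predictions.

    Assumption: the first n_lags samples are taken from measured data.
    """
    u = np.asarray(u, dtype=float)
    y_sim = np.zeros(len(u))
    y_sim[:n_lags] = np.asarray(y_init, dtype=float)[:n_lags]
    for k in range(n_lags, len(u)):
        y_sim[k] = _regressor(y_sim, u, k, n_lags) @ theta
    return y_sim


def select_lags(u_tr, y_tr, u_val, y_val, candidates):
    """Pick the lag with the lowest simulation RMSE on the validation split."""
    scores = {}
    for n in candidates:
        theta = fit_arx(u_tr, y_tr, n)
        y_sim = simulate_arx(theta, u_val, y_val, n)
        scores[n] = rmse(np.asarray(y_val, dtype=float)[n:], y_sim[n:])
    best = min(scores, key=scores.get)
    return best, scores
```

In use, one would call `select_lags` on the training and validation splits, refit with the chosen lag, and report `rmse` of the simulated output on the held-out test split. The test split is touched only once, after model selection, which is what makes the reported figure a measure of performance on unseen data.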
Abstract
Nonlinear system identification remains an important open challenge across research and academia. Large numbers of novel approaches are published each year, each presenting improvements or extensions to existing methods. It is natural, therefore, to consider how one might choose between these competing models. Benchmark datasets provide one clear way to approach this question. However, to draw meaningful inferences from benchmark performance, it is important to understand how well a new method performs compared with the results obtainable using well-established methods. This paper presents a set of ten baseline techniques and their relative performances on five popular benchmarks. The aim of this contribution is to stimulate thought and discussion regarding the objective comparison of identification methodologies.
