Numerically Efficient and Stable Algorithms for Kernel-Based Regularized System Identification Using Givens-Vector Representation

Zhuohua Shen, Junpeng Zhang, Martin S. Andersen, Tianshi Chen

TL;DR

The paper tackles the numerical instability that arises in kernel-based regularized system identification (KRSysId) when kernel matrices are handled through the generator representation (GR) of their semiseparable structure. It introduces a numerically stable Givens-vector representation (GvR) for widely used kernel matrices, derives GvR forms for kernel and output kernels, and develops $\text{O}(Np^2)$-cost algorithms for key tasks including matrix-vector products, Cholesky factorization, and trace calculations. In Monte Carlo simulations, the GvR-based methods are more numerically stable and accurate than the GR-based ones while keeping the same order of computational cost, which enables reliable kernel learning and hyper-parameter estimation. The work provides both theoretical constructions and practical algorithms, together with extensive experiments, to support broader adoption of GvR in KRSysId.

Abstract

Numerically efficient and stable algorithms are essential for kernel-based regularized system identification. The state-of-the-art algorithms exploit the semiseparable structure of the kernel and are based on the generator representation of the kernel matrix. However, as will be shown both in theory and in practice, the algorithms based on the generator representation are sometimes numerically unstable, which limits their application in practice. This paper aims to address this issue by deriving and exploiting an alternative Givens-vector representation of some widely used kernel matrices. Based on the Givens-vector representation, we derive algorithms that yield more accurate results than existing algorithms without sacrificing efficiency. We demonstrate their usage for kernel-based regularized system identification. Monte Carlo simulations show that the proposed algorithms have the same order of computational complexity as the state-of-the-art ones based on the generator representation, but without the numerical stability issues.
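One way the instability of a generator representation can show up is through overflow and underflow in the generators themselves. The sketch below is not taken from the paper: it assumes the commonly used DC kernel form $c\,\lambda^{(i+j)/2}\rho^{|i-j|}$ (the paper's parametrization may differ), and the function names `dc_generators` and `dc_entry` and the parameter values are hypothetical. For $i \ge j$ the entry factors as $u_i v_j$ with $u_i = c(\sqrt{\lambda}\rho)^i$ and $v_j = (\sqrt{\lambda}/\rho)^j$, so $v_j$ grows geometrically whenever $\rho < \sqrt{\lambda}$.

```python
import numpy as np

def dc_generators(N, c, lam, rho):
    """Generators u, v with K[i, j] = u[i] * v[j] for i >= j (assumed DC form)."""
    t = np.arange(1, N + 1, dtype=float)
    with np.errstate(over="ignore", under="ignore"):
        u = c * (np.sqrt(lam) * rho) ** t  # decays geometrically
        v = (np.sqrt(lam) / rho) ** t      # grows geometrically when rho < sqrt(lam)
    return u, v

def dc_entry(i, j, c, lam, rho):
    """Direct evaluation of the assumed DC kernel at 1-based indices (i, j)."""
    return c * lam ** ((i + j) / 2.0) * rho ** abs(i - j)

# Hypothetical values, chosen so that the generators leave the float64 range.
N, c, lam, rho = 700, 1.0, 0.99, 0.3
u, v = dc_generators(N, c, lam, rho)

with np.errstate(invalid="ignore"):
    from_generators = u[-1] * v[-1]      # last diagonal entry via generators
direct = dc_entry(N, N, c, lam, rho)     # same entry evaluated directly

print("u_N =", u[-1], " v_N =", v[-1])
print("via generators:", from_generators, " direct:", direct)
```

With these values the last diagonal entry is a perfectly representable number of order $10^{-3}$, yet the generator product is undefined because $u_N$ has underflowed to zero and $v_N$ has overflowed to infinity. Any algorithm that works on $u$ and $v$ separately inherits this problem, which is the kind of failure a representation built from bounded quantities is meant to avoid.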

Paper Structure

This paper contains 28 sections, 4 theorems, 114 equations, 3 figures, 2 tables, 8 algorithms.

Key Result

Proposition 3.3

The kernel matrix $K_{\boldsymbol \eta}^{\mathrm{SS}}\in\mathcal{S}_{N,2}$ with $c=1$ has GvR and the kernel matrix $K_{\boldsymbol \eta}^{\mathrm{DC}}\in\mathcal{S}_{N,1}$ with $c=1$ has GvR for $i=1,\ldots,N-1$ and $\ell=1,\ldots,N$. Letting $\lambda=\rho$ in the GvR of $K_{\boldsymbol \eta}^{\mathrm{DC}}$ gives the GvR of the kernel matrix $K_{\boldsymbol \eta}^{\mathrm{TC}}\in\mathcal{S}_{N,1}$.
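To make the semiseparable structure behind $K_{\boldsymbol \eta}^{\mathrm{TC}}\in\mathcal{S}_{N,1}$ concrete, here is a minimal sketch of a linear-time matrix-vector product. It uses the generator representation rather than the GvR of the proposition, and it assumes the standard TC kernel $K_{ij}=c\,\lambda^{\max(i,j)}$, which may be parametrized differently in the paper. The names `tc_matvec` and `tc_dense` are hypothetical. With $u_i = c\lambda^i$ and $v_j = 1$, the product splits into a prefix sum over $j \le i$ and a suffix sum over $j > i$.

```python
import numpy as np

def tc_matvec(x, c, lam):
    """Compute K @ x in O(N) for K[i, j] = c * lam**max(i, j), 1-based i, j."""
    N = x.shape[0]
    u = c * lam ** np.arange(1, N + 1)         # generator u_i = c * lam^i (v_j = 1)
    prefix = np.cumsum(x)                      # sum of x_j over j <= i
    ux = u * x
    suffix = np.cumsum(ux[::-1])[::-1] - ux    # sum of u_j * x_j over j > i
    return u * prefix + suffix

def tc_dense(N, c, lam):
    """Dense reference matrix, used only to check the fast product."""
    t = np.arange(1, N + 1)
    return c * lam ** np.maximum.outer(t, t)

rng = np.random.default_rng(0)
x = rng.standard_normal(50)
fast = tc_matvec(x, 1.0, 0.9)
slow = tc_dense(50, 1.0, 0.9) @ x
print(np.max(np.abs(fast - slow)))
```

The dense product costs $O(N^2)$ operations while the structured one costs $O(N)$, which matches the $\text{O}(Np^2)$ scaling quoted above for $p=1$. For this particular kernel the generators stay bounded, so the sketch shows only the efficiency side of the structure, not the stability issue.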

Figures (3)

  • Figure 1: The logarithms of the averaged difference norms with respect to $\lambda$ using methods $\star\in\{\mathsf{GR},\mathsf{GRs},\mathsf{GvR},\mathsf{GvRt}\}$ while fixing $(c,\rho,\gamma)=(1,0.6,10^{-4})$. In the first two columns, $\mathsf{GR}$ and $\mathsf{GRs}$ are the same. The first row uses the unit impulse input (S1) where $\mathsf{GR}$ returns NaN when $\lambda=0.7$, and the second row uses the exponential input $u(t)=e^{-0.5t}$ (S2). The experiments are repeated 80 times.
  • Figure 1: The logarithms of the averaged difference norms with respect to $\lambda$ using methods $\star\in\{\mathsf{GR},\mathsf{GRs},\mathsf{GvR},\mathsf{GvRt}\}$ while fixing $(c,\rho,\gamma)=(1,0.6,10^{-4})$ and varying $\alpha=0.5,1.0,1.5$. In the first two columns, $\mathsf{GR}$ and $\mathsf{GRs}$ are the same. The experiments are repeated 80 times.
  • Figure 2: The first column shows the distributions of the model fit difference for $\mathsf{GR}$, $\mathsf{GRs}$, $\mathsf{GvR}$, and $\mathsf{GvRt}$, while the second column shows the distributions of the optimized GCV objectives for the four methods over 80 repeated experiments. The third column displays the logarithms of the averaged computation time (in seconds) for evaluating the GCV 200 times with respect to $N$ over 10 repeats, where the simulation is run on a Mac mini with Apple M4 Pro chip with 14-core CPU and 48 GB unified memory.

Theorems & Definitions (17)

  • Example 2.1: Function estimation in RKHS
  • Definition 3.1
  • Definition 3.2: $p$-semiseparable
  • Proposition 3.3
  • Proposition 3.4
  • Proposition 4.1
  • Proof 1
  • Remark 4.2
  • Theorem 4.3: Inverse of $L$
  • Proof 2
  • ...and 7 more