Nonlinear Principal Component Analysis with Random Bernoulli Features for Process Monitoring

Ke Chen, Dandan Jiang

TL;DR

This work tackles the costly computation of kernel-based nonlinear process monitoring by introducing random Bernoulli feature mappings that sparsify nonlinear feature generation. By projecting data into a sparse, bootstrap-informed feature space and applying PCA in that space (RB-PCA), the method achieves near-Gaussian kernel behavior with significantly reduced complexity, including a spectral-norm convergence bound for the kernel approximation. The authors develop four monitoring variants for static, dynamic, two-dimensional dynamic, and time-varying processes, and demonstrate that RB-PCA-based methods deliver substantial speedups (often orders of magnitude) with negligible performance loss on both synthetic and real data (e.g., Tennessee Eastman Process and Server Machine Dataset). Overall, the approach provides scalable, real-time capable nonlinear process monitoring, with broader applicability to other nonlinear problems and a public codebase for reproducibility.

Abstract

Monitored processes generate substantial amounts of data with highly complex structures, which has led to the development of numerous nonlinear statistical methods. However, most of these methods rely on computations involving large-scale dense kernel matrices. This dependence makes it difficult to meet the computational demands and real-time responsiveness required by online monitoring systems. To alleviate the computational burden of dense large-scale matrix multiplication, we incorporate the bootstrap sampling concept into random feature mapping and propose a novel random Bernoulli principal component analysis method to efficiently capture nonlinear patterns in the process. We derive a convergence bound for the kernel matrix approximation constructed using random Bernoulli features, which gives the approximation a theoretical guarantee. Subsequently, we design four fast process monitoring methods based on random Bernoulli principal component analysis to extend its nonlinear capabilities to diverse fault scenarios. Finally, numerical experiments and real-world data analyses are conducted to evaluate the performance of the proposed methods. Results demonstrate that the proposed methods offer excellent scalability and reduced computational complexity, achieving substantial cost savings with minimal performance loss compared to traditional kernel-based approaches.
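
The general idea of sparse random features followed by PCA can be illustrated with a minimal sketch. This is not the authors' construction, which the summary does not spell out. The sketch assumes Gaussian random weights thinned by a Bernoulli($p$) mask and rescaled by $1/\sqrt{p}$, cosine random features, and a Hotelling-style $T^2$ statistic on the leading principal components. All function names, the bandwidth `sigma`, and the rescaling are hypothetical choices made for illustration.

```python
import numpy as np

def make_sparse_weights(d, m, p, sigma, rng):
    """Gaussian weights thinned by a Bernoulli(p) mask (illustrative assumption)."""
    G = rng.normal(0.0, 1.0 / sigma, size=(d, m))
    mask = rng.random((d, m)) < p            # each entry kept with probability p
    W = G * mask / np.sqrt(p)                # rescale so entry variance stays 1/sigma^2
    b = rng.uniform(0.0, 2.0 * np.pi, size=m)
    return W, b

def random_features(X, W, b):
    """Map each row of X to m cosine features: sqrt(2/m) * cos(X W + b)."""
    m = W.shape[1]
    return np.sqrt(2.0 / m) * np.cos(X @ W + b)

def fit_pca_monitor(Z, n_components):
    """Fit linear PCA on the random features of normal training data."""
    mean = Z.mean(axis=0)
    _, s, Vt = np.linalg.svd(Z - mean, full_matrices=False)
    P = Vt[:n_components].T                          # loading matrix
    var = s[:n_components] ** 2 / (Z.shape[0] - 1)   # score variances
    return mean, P, var

def t2_statistic(Z, mean, P, var):
    """Hotelling-style T^2 of each sample in the retained subspace."""
    scores = (Z - mean) @ P
    return np.sum(scores ** 2 / var, axis=1)
```

In such a sketch, a control limit could be taken as an upper quantile of the training $T^2$ values (for example, `np.quantile(t2_train, 0.99)`), and a new sample would raise an alarm when its statistic exceeds that limit. A dense $n \times n$ kernel matrix is never formed; the cost is driven by the $n \times m$ feature matrix and the sparsity of `W`.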

Paper Structure

This paper contains 16 sections, 2 theorems, 17 equations, 3 figures, 4 tables, 2 algorithms.

Key Result

Lemma 1

Consider an independent sequence $\bm{Y}_1,\dots,\bm{Y}_m$ of random matrices. Suppose that each $\bm{Y}_k\in \mathbb{R}^{d\times d}\ (k=1,\dots,m)$ is a symmetric matrix satisfying $E\left(\bm{Y}_k\right)=\bm{\epsilon}_k$, and that there exists a positive real number $R$ such that $\Vert \lvert\bm{Y}_k\rver

Figures (3)

  • Figure 1: The accuracy and the computational complexity under varying numbers of random Bernoulli features $m$. FDR, fault detection rate; FAR, false alarm rate. DRBPCA, dynamic random Bernoulli PCA; 2D-RBPCA, two-dimensional random Bernoulli PCA; DKPCA, dynamic kernel PCA.
  • Figure 2: The accuracy and the computational complexity under varying parameters of Bernoulli distribution $p$. FDR, fault detection rate; FAR, false alarm rate. DRBPCA, dynamic random Bernoulli PCA; 2D-RBPCA, two-dimensional random Bernoulli PCA; DKPCA, dynamic kernel PCA.
  • Figure 3: The accuracy and the computational complexity under varying time lags $l$. FDR, fault detection rate; FAR, false alarm rate. DRBPCA, dynamic random Bernoulli PCA; 2D-RBPCA, two-dimensional random Bernoulli PCA; DKPCA, dynamic kernel PCA.
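
The two accuracy measures named in the captions can be computed as follows. This is a sketch that assumes the usual process-monitoring definitions: FDR is the fraction of truly faulty samples that raise an alarm, and FAR is the fraction of normal samples that raise an alarm. The function name and argument layout are hypothetical, and the paper's exact definitions may differ.

```python
import numpy as np

def detection_rates(alarms, is_fault):
    """Return (FDR, FAR) from per-sample alarm flags and ground-truth fault labels."""
    alarms = np.asarray(alarms, dtype=bool)
    is_fault = np.asarray(is_fault, dtype=bool)
    # FDR: share of faulty samples flagged; undefined (nan) if no faulty samples exist
    fdr = alarms[is_fault].mean() if is_fault.any() else float("nan")
    # FAR: share of normal samples flagged; undefined (nan) if no normal samples exist
    far = alarms[~is_fault].mean() if (~is_fault).any() else float("nan")
    return float(fdr), float(far)
```

Here `alarms` would typically come from comparing a monitoring statistic against its control limit, one flag per test sample.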

Theorems & Definitions (3)

  • Definition 1
  • Lemma 1: Approximate matrix Bernstein's inequality
  • Theorem 1