Adaptive Error-Bounded Hierarchical Matrices for Efficient Neural Network Compression
John Mango, Ronald Katende
TL;DR
This work tackles the computational and memory bottlenecks of Physics-Informed Neural Networks (PINNs) caused by large dense weight matrices. It introduces a dynamic, error-bounded hierarchical matrix (H-matrix) compression that adaptively refines block structure while preserving Neural Tangent Kernel (NTK) properties, ensuring stable and efficient training. The approach combines error-estimated adaptive refinement, convergence acceleration, and a block-rank regularization mechanism, demonstrating superior performance over traditional compression techniques (e.g., Singular Value Decomposition, pruning, and quantization) and enabling real-time inference. The results indicate improved convergence, robustness to perturbations, and scalable deployment of PINNs in complex scientific and engineering PDE problems.
Abstract
This paper introduces a dynamic, error-bounded hierarchical matrix (H-matrix) compression method tailored for Physics-Informed Neural Networks (PINNs). The proposed approach reduces the computational complexity and memory demands of large-scale physics-based models while preserving the essential properties of the Neural Tangent Kernel (NTK). By adaptively refining hierarchical matrix approximations based on local error estimates, our method ensures efficient training and robust model performance. Empirical results demonstrate that this technique outperforms traditional compression methods, such as Singular Value Decomposition (SVD), pruning, and quantization, by maintaining high accuracy and improving generalization capabilities. Additionally, the dynamic H-matrix method enhances inference speed, making it suitable for real-time applications. This approach offers a scalable and efficient solution for deploying PINNs in complex scientific and engineering domains, bridging the gap between computational feasibility and real-world applicability.
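To make the idea of error-driven adaptive refinement concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a simplified quadtree block low-rank scheme: each block is replaced by a truncated SVD when its relative Frobenius error is within a tolerance, and is otherwise split into four sub-blocks until a minimum size is reached. The names `compress`, `reconstruct`, `stored_entries`, and the parameters `tol`, `max_rank`, and `min_size` (assumed to be at least 1) are hypothetical, and the test matrix is a synthetic smooth kernel rather than a trained PINN weight matrix. NTK preservation and training are outside the scope of this sketch.

```python
import numpy as np


def compress(block, tol, max_rank, min_size):
    """Recursively approximate `block`; returns a nested tuple tree."""
    m, n = block.shape
    U, s, Vt = np.linalg.svd(block, full_matrices=False)
    r = min(max_rank, len(s))
    # Local error estimate: the Frobenius error of the best rank-r
    # approximation equals the norm of the discarded singular values.
    err = np.sqrt(np.sum(s[r:] ** 2))
    norm = np.sqrt(np.sum(s ** 2))
    if err <= tol * norm:
        # Accept the block: store the two low-rank factors.
        return ("lowrank", U[:, :r] * s[:r], Vt[:r])
    if min(m, n) <= min_size:
        # Too small to refine further: keep the block exactly.
        return ("dense", block.copy())
    # Refine: split into a 2 x 2 grid of sub-blocks and recurse.
    hm, hn = m // 2, n // 2
    rows = (slice(0, hm), slice(hm, m))
    cols = (slice(0, hn), slice(hn, n))
    children = [[compress(block[i, j], tol, max_rank, min_size) for j in cols]
                for i in rows]
    return ("split", children)


def reconstruct(node):
    """Rebuild the dense approximation from the tree."""
    if node[0] == "lowrank":
        return node[1] @ node[2]
    if node[0] == "dense":
        return node[1]
    return np.block([[reconstruct(c) for c in row] for row in node[1]])


def stored_entries(node):
    """Count the floating-point entries kept by the tree."""
    if node[0] == "lowrank":
        return node[1].size + node[2].size
    if node[0] == "dense":
        return node[1].size
    return sum(stored_entries(c) for row in node[1] for c in row)


# Synthetic stand-in for a weight matrix (assumption): a smooth kernel
# whose off-diagonal blocks are numerically low rank.
x = np.linspace(0.0, 1.0, 128)
W = 1.0 / (1.0 + 50.0 * np.abs(x[:, None] - x[None, :]))

tol = 1e-3
tree = compress(W, tol, max_rank=4, min_size=8)
W_hat = reconstruct(tree)
rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(rel_err, stored_entries(tree), W.size)
```

Because every accepted leaf satisfies the local bound and dense leaves are exact, summing the squared leaf errors shows that the global relative Frobenius error is also at most `tol`. This is the sense in which such a scheme is error-bounded, under the assumptions stated above.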
