A Lightweight Gradient-based Causal Discovery Framework with Applications to Complex Industrial Processes
Meiliang Liu, Huiwen Dong, Xiaoxiao Yang, Yunfang Xu, Zijin Li, Zhengye Si, Xinyue Yang, Zhiwen Zhao
TL;DR
This work tackles scalable causal discovery in multivariate time series by moving away from component-wise neural architectures. It introduces Gradient-based Causal Discovery (GCD), which uses a single MLP and applies L1 regularization on input-output gradients to infer Granger causality, complemented by a Phase-randomization Surrogate Statistical Test (PSST) for significance. Across benchmarks (Lorenz-96, DREAM4, CausalTime) and industrial datasets (Tennessee-Eastman, Ultra-processed Food, Debutanizer), GCD delivers superior accuracy while markedly reducing computational overhead. The approach offers a flexible, scalable tool for reconstructing causal structures in complex industrial systems and can be integrated with diverse forecasting backbones.
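The surrogate idea behind a phase-randomization test can be illustrated with a minimal sketch. It assumes the standard Fourier construction: keep each series' amplitude spectrum, randomize its phases, and compare an observed causal score against scores recomputed on the surrogates. The function names `phase_randomized_surrogate` and `empirical_p_value` are hypothetical, and the source does not specify how PSST forms its test statistic, so the p-value rule here is only a common convention.

```python
import numpy as np

def phase_randomized_surrogate(x, rng):
    """Return a series with the same amplitude spectrum as x but random phases."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    phases[0] = 0.0          # leave the DC bin untouched so the mean is preserved
    if n % 2 == 0:
        phases[-1] = 0.0     # the Nyquist bin must stay real when n is even
    # Rotating each bin by a random angle keeps its magnitude unchanged.
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

def empirical_p_value(observed, surrogate_scores):
    """Assumed convention: fraction of surrogate scores at least as large as observed."""
    surrogate_scores = np.asarray(surrogate_scores)
    return (1 + np.sum(surrogate_scores >= observed)) / (1 + len(surrogate_scores))

# Usage on a synthetic noisy sine wave (illustrative data only).
rng = np.random.default_rng(0)
x = np.sin(np.linspace(0.0, 20.0, 256)) + 0.1 * rng.standard_normal(256)
s = phase_randomized_surrogate(x, rng)
```

Because the surrogate keeps the power spectrum (and hence the autocorrelation) of the original series while destroying its cross-series timing, a causal score that stays large on the original data but not on the surrogates is evidence against a spurious link.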
Abstract
With the advancement of deep learning, various neural network-based Granger causality models have been proposed. Although these models have demonstrated notable improvements, several limitations remain. Most existing approaches adopt a component-wise architecture that requires a separate model for each time series, which incurs substantial computational cost. In addition, imposing a sparsity-inducing penalty on the first-layer weights of the neural network to extract causal relationships weakens the model's ability to capture complex interactions. To address these limitations, we propose Gradient Regularization-based Neural Granger Causality (GRNGC), which requires only one time series prediction model and applies $L_{1}$ regularization to the gradient of the model's output with respect to its input to infer Granger causality. Moreover, GRNGC is not tied to a specific time series forecasting model and can be implemented with diverse architectures such as KAN, MLP, and LSTM, offering enhanced flexibility. Numerical simulations on DREAM, Lorenz-96, fMRI BOLD, and CausalTime show that GRNGC outperforms existing baselines and significantly reduces computational overhead. Experiments on real-world DNA, Yeast, HeLa, and bladder urothelial carcinoma datasets further validate the model's effectiveness in reconstructing gene regulatory networks.
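The gradient penalty described above can be sketched for the simplest case. The example assumes a single one-hidden-layer tanh MLP that maps a window of `lag` past values of all `p` series to the next value of every series, with the input laid out as `lag` consecutive blocks of `p` variables. The helper names, the analytic Jacobian, and the rule of summing absolute gradients over lags to get a `p x p` score matrix are illustrative assumptions, not the authors' implementation. Training is omitted; in practice the penalty would be added to the prediction loss and differentiated with an autodiff framework.

```python
import numpy as np

def mlp_jacobian(x, W1, b1, W2):
    """Jacobian dy/dx of y = W2 @ tanh(W1 @ x + b1) + b2 at one input x."""
    h = np.tanh(W1 @ x + b1)
    # Chain rule: W2 @ diag(1 - h^2) @ W1, shape (p, p * lag).
    return W2 @ ((1.0 - h ** 2)[:, None] * W1)

def gradient_l1_and_scores(X, W1, b1, W2, p, lag):
    """Return the L1 gradient penalty and a p x p causal score matrix."""
    # Average the absolute input-output gradient over all input windows.
    mean_abs = np.mean([np.abs(mlp_jacobian(x, W1, b1, W2)) for x in X], axis=0)
    penalty = mean_abs.sum()
    # Assumed layout: input index = l * p + j for lag l and source series j.
    # scores[i, j] aggregates the influence of series j on series i over lags.
    scores = mean_abs.reshape(p, lag, p).sum(axis=1)
    return penalty, scores

# Usage with random weights (illustrative only, no training performed).
rng = np.random.default_rng(0)
p, lag, hidden = 3, 2, 8
W1 = rng.standard_normal((hidden, p * lag))
b1 = rng.standard_normal(hidden)
W2 = rng.standard_normal((p, hidden))
W1[:, [1, 1 + p]] = 0.0  # cut every input connection from series 1 at both lags
X = rng.standard_normal((50, p * lag))
penalty, scores = gradient_l1_and_scores(X, W1, b1, W2, p, lag)
```

In this toy setup the column `scores[:, 1]` is exactly zero because series 1 never reaches the output, which is the pattern a sparsity penalty on the gradients is meant to encourage for non-causal inputs.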
