A Lightweight Gradient-based Causal Discovery Framework with Applications to Complex Industrial Processes

Meiliang Liu, Huiwen Dong, Xiaoxiao Yang, Yunfang Xu, Zijin Li, Zhengye Si, Xinyue Yang, Zhiwen Zhao

TL;DR

This work tackles scalable causal discovery in multivariate time series by moving away from component-wise neural architectures. It introduces Gradient-based Causal Discovery (GCD), which uses a single MLP and applies L1 regularization on input-output gradients to infer Granger causality, complemented by a Phase-randomization Surrogate Statistical Test (PSST) for significance. Across benchmarks (Lorenz-96, DREAM4, CausalTime) and industrial datasets (Tennessee-Eastman, Ultra-processed Food, Debutanizer), GCD delivers superior accuracy while markedly reducing computational overhead. The approach offers a flexible, scalable tool for reconstructing causal structures in complex industrial systems and can be integrated with diverse forecasting backbones.
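The summary above names a phase-randomization surrogate test but does not spell out its steps, so the following is only a generic sketch of that family of tests, not the authors' PSST. It assumes the usual recipe: keep each Fourier amplitude of the candidate cause, randomize the phases, and compare the observed score against scores obtained from the surrogates. The function names, the lagged-correlation score (a stand-in for a model-derived causal score), and the choice of 199 surrogates are all hypothetical.

```python
import numpy as np


def phase_randomized_surrogate(x, rng):
    """Surrogate of a 1-D series: same amplitude spectrum, randomized phases."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    phases[0] = 0.0          # leave the DC bin alone so the mean is preserved
    if n % 2 == 0:
        phases[-1] = 0.0     # the Nyquist bin must stay real for even n
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)


def lagged_abs_corr(source, target, lag=1):
    """Hypothetical stand-in score: |corr(source[t - lag], target[t])|."""
    return abs(np.corrcoef(source[:-lag], target[lag:])[0, 1])


def surrogate_p_value(source, target, score_fn, n_surrogates=199, seed=0):
    """Share of surrogate scores at least as large as the observed score."""
    rng = np.random.default_rng(seed)
    observed = score_fn(source, target)
    null_scores = np.array([
        score_fn(phase_randomized_surrogate(source, rng), target)
        for _ in range(n_surrogates)
    ])
    # The +1 terms count the observed series as one member of the null ensemble.
    return (1 + np.sum(null_scores >= observed)) / (1 + n_surrogates)


# Toy data (illustrative only): y follows x with a one-step delay plus noise.
rng = np.random.default_rng(42)
x = rng.standard_normal(500)
y = np.zeros(500)
y[1:] = 0.8 * x[:-1] + 0.5 * rng.standard_normal(499)
p_value = surrogate_p_value(x, y, lagged_abs_corr)
print(f"surrogate p-value for x -> y: {p_value:.3f}")
```

Randomizing phases destroys the timing relationship between the two series while leaving the power spectrum of the candidate cause unchanged, which is why the surrogate scores can serve as a null distribution.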

Abstract

With the advancement of deep learning technologies, various neural network-based Granger causality models have been proposed. Although these models have demonstrated notable improvements, several limitations remain. Most existing approaches adopt a component-wise architecture, which requires a separate model for each time series and therefore incurs substantial computational costs. In addition, imposing a sparsity-inducing penalty on the first-layer weights of the neural network to extract causal relationships weakens the model's ability to capture complex interactions. To address these limitations, we propose Gradient Regularization-based Neural Granger Causality (GRNGC), which requires only one time series prediction model and applies $L_{1}$ regularization to the gradient between the model's input and output to infer Granger causality. Moreover, GRNGC is not tied to a specific time series forecasting model and can be implemented with diverse architectures such as KAN, MLP, and LSTM, offering enhanced flexibility. Numerical simulations on DREAM, Lorenz-96, fMRI BOLD, and CausalTime show that GRNGC outperforms existing baselines and significantly reduces computational overhead. Meanwhile, experiments on real-world DNA, Yeast, HeLa, and bladder urothelial carcinoma datasets further validate the model's effectiveness in reconstructing gene regulatory networks.
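To make the central idea concrete, here is a minimal NumPy sketch of an $L_{1}$ penalty on the input-output gradient of a single shared predictor, and of reading a causal score matrix from that gradient. It is an illustration under stated assumptions rather than the paper's implementation: the one-hidden-layer tanh MLP, the single-lag input, the analytic Jacobian (an autodiff framework would normally supply it), the weight `lam`, and all function names are hypothetical, and no training loop is shown.

```python
import numpy as np


def mlp_forward(x, W1, b1, W2, b2):
    """One shared MLP mapping all D series at time t-1 to all D series at t."""
    h = np.tanh(x @ W1.T + b1)        # hidden activations, shape (N, H)
    return h @ W2.T + b2, h           # predictions, shape (N, D)


def input_output_jacobian(h, W1, W2):
    """Per-sample Jacobian dy_i/dx_j = sum_k W2[i,k] * (1 - h_k^2) * W1[k,j]."""
    return np.einsum("ik,nk,kj->nij", W2, 1.0 - h**2, W1)   # shape (N, D, D)


def gradient_l1_penalty(jac):
    """L1 norm of the input-output gradient, averaged over samples and entries."""
    return np.abs(jac).mean()


def causal_scores(jac):
    """Entry [i, j]: mean absolute sensitivity of output series i to input series j."""
    return np.abs(jac).mean(axis=0)


# Toy setup (illustrative only): random data and untrained random weights.
rng = np.random.default_rng(0)
N, D, H = 64, 4, 8
x_prev = rng.standard_normal((N, D))      # inputs: all series at time t-1
x_next = rng.standard_normal((N, D))      # targets: all series at time t
W1, b1 = rng.standard_normal((H, D)), np.zeros(H)
W2, b2 = rng.standard_normal((D, H)), np.zeros(D)

pred, hidden = mlp_forward(x_prev, W1, b1, W2, b2)
jac = input_output_jacobian(hidden, W1, W2)
lam = 0.1                                 # hypothetical regularization weight
loss = np.mean((pred - x_next) ** 2) + lam * gradient_l1_penalty(jac)
scores = causal_scores(jac)               # (D, D) candidate Granger-causal matrix
print(scores.shape, float(loss) > 0.0)
```

Because the penalty acts on the gradient rather than on first-layer weights, one model can serve every target series: row `i` of `scores` plays the role that the `i`-th separate network plays in a component-wise design.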

Paper Structure

This paper contains 25 sections, 19 equations, 6 figures, 8 tables, 1 algorithm.

Figures (6)

  • Figure 1: (Left) The component-wise architecture proposed by Tank et al. [1], which requires $D$ models for a $D$-dimensional time series. (Right) The architecture of the proposed GCD, which requires only one model for a $D$-dimensional time series.
  • Figure 2: The causal discovery results of each model on the TE dataset. (a) Ground truth. (b) cMLP. (c) cLSTM. (d) TCDF. (e) eSRU. (f) GVAR. (g) CR-VAE. (h) CUTS+. (i) GCD.
  • Figure 3: The causal discovery results of each model on the UF dataset. (a) Ground truth. (b) cMLP. (c) cLSTM. (d) TCDF. (e) eSRU. (f) GVAR. (g) CR-VAE. (h) CUTS+. (i) GCD.
  • Figure 4: The causal discovery results of each model on the Debutanizer dataset. (a) Ground truth. (b) cMLP. (c) cLSTM. (d) TCDF. (e) eSRU. (f) GVAR. (g) CR-VAE. (h) CUTS+. (i) GCD.
  • Figure 5: Hyperparameter tuning results. (a) TE. (b) UF. (c) DE.
  • ...and 1 more figure