Towards Learning and Explaining Indirect Causal Effects in Neural Networks
Abbavaram Gowtham Reddy, Saketh Bachu, Harsharaj Pathak, Benin L Godfrey, Vineeth N. Balasubramanian, Varshaneya V, Satya Narayanan Kar
TL;DR
The paper tackles the limitation that standard neural networks typically capture only direct causal effects between inputs and outputs, neglecting indirect pathways. It introduces AHCE, an ante-hoc framework that augments NNs with lateral input connections to learn and preserve direct, indirect, and total causal effects during training. The approach formalizes $ACE^{\hat{Y}}_{X_i}$, $ADCE^{\hat{Y}}_{X_i}$, and $AICE^{\hat{Y}}_{X_i}$, and employs a two-phase training regime for an augmented network $\mathcal{N}^{Ind}$, along with a second-order Taylor expansion to estimate interventional expectations and a binning-based strategy for scalable computation. Experiments on synthetic and real-world datasets show that AHCE better approximates ground-truth causal effects than baselines, while efficiency techniques enable application to high-dimensional data. Overall, AHCE provides a principled, scalable path to learning and explaining indirect causal effects in neural models with significant implications for reliability and fairness in safety-critical AI systems.
Abstract
Recently, there has been growing interest in learning and explaining causal effects within neural network (NN) models. Because of how NN architectures are structured, previous approaches consider only direct and total causal effects, assuming independence among input variables. We view an NN as a structural causal model (SCM) and extend our focus to include indirect causal effects by introducing feedforward connections among input neurons. We propose an ante-hoc method that captures and maintains direct, indirect, and total causal effects during NN model training. We also propose an algorithm for quantifying the causal effects learned by an NN model, along with efficient approximation strategies for quantifying causal effects in high-dimensional data. Extensive experiments on synthetic and real-world datasets demonstrate that the causal effects learned by our ante-hoc method approximate the ground-truth effects better than those learned by existing methods.
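
The TL;DR mentions a second-order Taylor expansion for estimating interventional expectations. As a rough illustration of that general idea only (not the authors' algorithm), the sketch below approximates $E[f(X)]$ by $f(\mu) + \tfrac{1}{2}\,\mathrm{tr}(\nabla^2 f(\mu)\,\Sigma)$, where $\mu$ and $\Sigma$ are the mean and covariance of the inputs. The names `numerical_hessian` and `taylor_expectation`, the finite-difference Hessian, and the toy function are all assumptions made for this example; how $\mu$ and $\Sigma$ are set under an intervention $do(X_i = \alpha)$ is specific to the paper and is not shown here.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def numerical_hessian(
    f: Callable[[List[float]], float], mu: Vector, eps: float = 1e-3
) -> List[List[float]]:
    """Central finite-difference Hessian of f at mu (illustrative, not optimized)."""
    n = len(mu)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):

            def shifted(di: float, dj: float) -> float:
                # Evaluate f with coordinate i moved by di and coordinate j by dj.
                x = list(mu)
                x[i] += di
                x[j] += dj
                return f(x)

            hess[i][j] = (
                shifted(eps, eps)
                - shifted(eps, -eps)
                - shifted(-eps, eps)
                + shifted(-eps, -eps)
            ) / (4.0 * eps * eps)
    return hess


def taylor_expectation(
    f: Callable[[List[float]], float], mu: Vector, cov: Matrix, eps: float = 1e-3
) -> float:
    """Second-order Taylor estimate: E[f(X)] ~ f(mu) + 0.5 * trace(H(mu) @ cov)."""
    hess = numerical_hessian(f, mu, eps)
    n = len(mu)
    # trace(H @ cov) written out as a double sum to avoid a matrix library.
    trace_term = sum(hess[i][j] * cov[j][i] for i in range(n) for j in range(n))
    return f(list(mu)) + 0.5 * trace_term


def toy_model(x: List[float]) -> float:
    """Hypothetical stand-in for a network output; quadratic in its inputs."""
    return x[0] ** 2 + 3.0 * x[0] * x[1] + 2.0 * x[1]


if __name__ == "__main__":
    mu = [1.0, 2.0]
    cov = [[0.5, 0.1], [0.1, 0.2]]
    print(taylor_expectation(toy_model, mu, cov))
```

For a quadratic function such as `toy_model`, the second-order expansion is exact up to floating-point error; for a general NN output it is only an approximation whose quality depends on how nonlinear the function is around $\mu$.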
