Causal Neural Probabilistic Circuits

Weixin Chen; Han Zhao

TL;DR

The paper proposes the Causal Neural Probabilistic Circuit (CNPC), which combines a neural attribute predictor with a causal probabilistic circuit compiled from a causal graph. The circuit supports exact, tractable causal inference that inherently respects causal dependencies.

Abstract

Concept Bottleneck Models (CBMs) enhance the interpretability of end-to-end neural networks by introducing a layer of concepts and predicting the class label from the concept predictions. A key property of CBMs is that they support interventions, i.e., domain experts can correct mispredicted concept values at test time to improve the final accuracy. However, typical CBMs apply interventions by overwriting only the corrected concept while leaving other concept predictions unchanged, which ignores causal dependencies among concepts. To address this, we propose the Causal Neural Probabilistic Circuit (CNPC), which combines a neural attribute predictor with a causal probabilistic circuit compiled from a causal graph. This circuit supports exact, tractable causal inference that inherently respects causal dependencies. Under interventions, CNPC models the class distribution based on a Product of Experts (PoE) that fuses the attribute predictor's predictive distribution with the interventional marginals computed by the circuit. We theoretically characterize the compositional interventional error of CNPC w.r.t. its modules and identify conditions under which CNPC closely matches the ground-truth interventional class distribution. Experiments on five benchmark datasets in both in-distribution and out-of-distribution settings show that, compared with five baseline models, CNPC achieves higher task accuracy across different numbers of intervened attributes.
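The PoE fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the weighted geometric form below, p(a) ∝ p_predictor(a)^α · p_circuit(a)^(1-α), is an assumption, as are the function name `poe_fuse`, the choice of which factor α weights, and the toy numbers.

```python
import math

def poe_fuse(p_predictor, p_circuit, alpha):
    """Fuse two categorical distributions over the same attribute values.

    Assumed form (not taken from the paper):
        fused(a) ∝ p_predictor(a)**alpha * p_circuit(a)**(1 - alpha)
    """
    if len(p_predictor) != len(p_circuit):
        raise ValueError("both distributions must have the same support")
    eps = 1e-12  # guards against log(0)
    # Work in log space for numerical stability.
    log_scores = [
        alpha * math.log(p + eps) + (1.0 - alpha) * math.log(q + eps)
        for p, q in zip(p_predictor, p_circuit)
    ]
    shift = max(log_scores)
    weights = [math.exp(s - shift) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical numbers: the predictor's belief about one attribute and the
# circuit's interventional marginal for the same attribute.
p_nn = [0.7, 0.2, 0.1]
p_pc = [0.2, 0.6, 0.2]
fused = poe_fuse(p_nn, p_pc, alpha=0.5)
```

Under this assumed form, `alpha=1.0` recovers the predictor's distribution and `alpha=0.0` recovers the circuit's marginal, so α acts as the balancing weight between the two modules.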

Paper Structure

This paper contains 56 sections, 6 theorems, 37 equations, 10 figures, and 1 table.

Introduction
Related Work
Modeling Concept Dependencies in CBMs.
Intervention Strategies.
Preliminaries
Notation and Assumptions
Neural Probabilistic Circuits
Causal Probabilistic Circuits
Method
Why is Modeling the Interventional Class Distribution Hard
Causal Neural Probabilistic Circuits
Theoretical Analysis
Remark.
Experiments
Experimental Settings
...and 41 more sections

Key Result

Theorem 3

The expected prediction error of NPC (CNPC), measured by KL divergence, is upper-bounded by the sum of the expected prediction error of the attribute predictor and that of the PC (causal PC). Equality holds if, for each $(x,y)$, there exists a constant $c(x,y)$ s.t. $\frac{\mathbb{P}_*(a_{1:K}\mid x)\mathbb{P}_*(y\mid a_{1:K})}{\mathbb{P}_{\theta}(a_{1:K}\mid x)\mathbb{P}_{w}(y\mid a
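
The bound stated in words above can be checked numerically on a toy example. The sketch below assumes that the class distribution for an input is obtained by summing, over attribute assignments, the product of the attribute predictor's distribution and the circuit's class conditional; that composition, the helper names, and all numbers are assumptions for illustration, not taken from the paper. It covers a single input x rather than an expectation over inputs.

```python
import math

def kl(p, q):
    """KL divergence KL(p || q) between two categorical distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def class_distribution(p_a, p_y_given_a):
    """Marginalize attributes: P(y) = sum_a P(a) * P(y | a) (assumed composition)."""
    n_classes = len(p_y_given_a[0])
    return [
        sum(p_a[a] * p_y_given_a[a][y] for a in range(len(p_a)))
        for y in range(n_classes)
    ]

# Hypothetical toy setup for one input x: 2 attribute assignments, 2 classes.
true_a = [0.6, 0.4]                          # ground-truth P*(a | x)
model_a = [0.5, 0.5]                         # attribute predictor P_theta(a | x)
true_y_given_a = [[0.9, 0.1], [0.2, 0.8]]    # ground-truth P*(y | a)
model_y_given_a = [[0.8, 0.2], [0.3, 0.7]]   # circuit P_w(y | a)

# Error of the composed model on the class distribution.
overall_err = kl(
    class_distribution(true_a, true_y_given_a),
    class_distribution(model_a, model_y_given_a),
)
# Error of the attribute predictor.
attr_err = kl(true_a, model_a)
# Error of the circuit, averaged over ground-truth attribute assignments.
circuit_err = sum(
    true_a[a] * kl(true_y_given_a[a], model_y_given_a[a])
    for a in range(len(true_a))
)

bound_holds = overall_err <= attr_err + circuit_err
```

In this toy setting the inequality follows from the log-sum inequality: the KL divergence between marginals over y never exceeds the KL divergence between the joint distributions over (a, y), which decomposes into the two module errors.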

Figures (10)

Figure 1: A typical CBM (top left module + top right module) performs the intervention on $A_1$ by replacing the neural network's predictions for $A_1$ with the ground-truth distribution, while leaving the predictions for other attributes unchanged. In contrast, CNPC (top left module + bottom module) combines a neural network with a causal PC compiled from a causal graph, and approximates the interventional class distribution based on a PoE with a balancing weight $\alpha$ that fuses complementary information from the two modules.
Figure 2: A PC that compiles the causal graph $V_2 \leftarrow V_1 \rightarrow V_3$.
Figure 3: Task accuracy of CNPC and baseline models in the benign setting on the Asia, Sachs, MNISTAdd, and CelebA datasets under varying numbers of attribute interventions. All results are averaged across three random seeds.
Figure 4: Task accuracy of CNPC and baseline models in OOD settings on the MNISTAdd and CelebA datasets under varying numbers of attribute interventions. All results are averaged across three random seeds.
Figure 5: Task accuracy of CNPC in both benign and OOD settings with $\alpha$ varying from 0.0 to 1.0 in increments of 0.1. Left: Performance on MNISTAdd with one intervened attribute. Middle: Performance on CelebA with two intervened attributes. Right: Performance on CelebA with four intervened attributes. All results are averaged across three random seeds.
...and 5 more figures

Theorems & Definitions (9)

Theorem 3
Corollary 4
Corollary 5
Theorem 6: Restatement of Theorem 3
proof
Corollary 7: Restatement of Corollary 4
proof
Corollary 8: Restatement of Corollary 5
proof
