C-HDNet: A Fast Hyperdimensional Computing Based Method for Causal Effect Estimation from Networked Observational Data

Abhishek Dalvi, Neil Ashtekar, Vasant Honavar

TL;DR

C-HDNet targets causal effect estimation from networked observational data by combining hyperdimensional (HD) computing with network-aware matching. It maps covariates to high-dimensional vectors, encodes 1-hop and 2-hop neighborhood information via RelHD-inspired HD representations, and performs k-nearest-neighbor (KNN) outcome prediction to estimate counterfactuals. The approach is training-free and runs roughly an order of magnitude faster than state-of-the-art deep learning baselines while achieving competitive or lower causal effect estimation error on the BlogCatalog and Flickr datasets. Ablations over 0-hop, 1-hop, and 2-hop encodings illustrate the value of network information. The result is a practical, scalable method for network deconfounding that could be extended to temporal networks and other causal inference tasks.
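The pipeline summarized above (encode covariates as hypervectors, fold in 1-hop and 2-hop neighborhood information, then match with KNN) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `encode_nodes` and `estimate_ite`, the random-projection encoder, the random bipolar role vectors used for binding, the cosine-similarity matching against the opposite treatment group, and the defaults `dim=2048` and `k=5` are all assumptions made for illustration.

```python
import numpy as np

def encode_nodes(X, A, dim=2048, seed=0):
    """Illustrative HD encoding (assumed, not the paper's exact scheme).

    X: (n, d) covariates; A: (n, n) adjacency matrix.
    Returns an (n, dim) matrix of node hypervectors.
    """
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((X.shape[1], dim))
    h0 = np.sign(X @ proj)          # 0-hop: bipolar code of own covariates
    h0[h0 == 0] = 1.0
    h1 = np.sign(A @ h0)            # 1-hop: bundle of neighbors' codes
    h2 = np.sign(A @ (A @ h0))      # 2-hop: bundle over length-2 walks
    # Bind each hop to a random bipolar role vector, then bundle by summing.
    roles = rng.choice([-1.0, 1.0], size=(3, dim))
    return roles[0] * h0 + roles[1] * h1 + roles[2] * h2

def estimate_ite(H, t, y, k=5):
    """KNN matching sketch: impute each unit's counterfactual outcome as the
    mean outcome of its k most similar units in the opposite treatment group.
    Assumes both treatment groups are non-empty."""
    Hn = H / (np.linalg.norm(H, axis=1, keepdims=True) + 1e-12)
    sim = Hn @ Hn.T                 # cosine similarity between all pairs
    ite = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        opposite = np.flatnonzero(t != t[i])
        nearest = opposite[np.argsort(sim[i, opposite])[-k:]]
        y_cf = y[nearest].mean()
        # ITE is defined as y(treated) - y(control) for every unit.
        ite[i] = y[i] - y_cf if t[i] == 1 else y_cf - y[i]
    return ite
```

Under these assumptions, averaging the returned individual effects gives an estimate of the average treatment effect. No parameters are fitted, which reflects the training-free character described above.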

Abstract

We consider the problem of estimating causal effects from observational data in the presence of network confounding. In this context, an individual's treatment assignment and outcomes may be affected by their neighbors within the network. We propose a novel matching technique which leverages hyperdimensional computing to model network information and improve predictive performance. We present results of extensive experiments which show that the proposed method outperforms or is competitive with the state-of-the-art methods for causal effect estimation from network data, including advanced computationally demanding deep learning methods. Further, our technique benefits from simplicity and speed, with roughly an order of magnitude lower runtime compared to state-of-the-art methods, while offering similar causal effect estimation error rates.

Paper Structure

This paper contains 17 sections, 10 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: The causal diagram for the networked observational data problem setting. $\mathbf{X}$ denotes the covariates (features) of the nodes/individuals and $\mathbf{A}$ denotes the network structure. $\mathbf{Z}$ is a latent representation and is therefore drawn as a dotted circle. Ideally, conditioning on $\mathbf{Z}$ should deconfound the causal effect of treatment $\mathbf{T}$ on outcome $\mathbf{Y}$.
  • Figure 2: Runtime comparison between C-HDNet and NetDeconf across different datasets. BlogCat-R and Flickr-R refer to the datasets generated using randomly sampled 0-hop, 1-hop, and 2-hop confounding factors, while the remaining datasets are from Guo et al. [guo2020learning, guo2020counterfactual].
  • Figure 3: C-HDNet error in the network ablation studies. The 0-hop plot uses the HD representation without network data, the 1-hop plot adds only first-hop neighborhood information, and the 2-hop plot incorporates both first- and second-hop neighborhood information.