Practical Kernel Tests of Conditional Independence

Roman Pogodin, Antonin Schrab, Yazhe Li, Danica J. Sutherland, Arthur Gretton

TL;DR

The paper tackles kernel-based conditional independence (CI) testing and the bias introduced by conditional mean embedding (CME) estimation. It introduces SplitKCI, a debiased CI statistic that uses independent data splits to reduce CME-induced bias, paired with a train/test split heuristic to balance Type I and Type II errors. The authors prove consistency and wild bootstrap validity for SplitKCI, and demonstrate through extensive synthetic and real-data experiments that SplitKCI maintains the nominal level while achieving higher power than existing kernel-based and non-kernel CI tests. The approach yields practical, data-efficient CI testing with strong level control and competitive sensitivity, applicable to complex, nonlinear dependencies. The paper also discusses extensions, parametric alternatives, and interpretability considerations for sensitive domains.

Abstract

We describe a data-efficient, kernel-based approach to statistical testing of conditional independence. A major challenge of conditional independence testing is to obtain the correct test level (the specified upper bound on the rate of false positives), while still attaining competitive test power. Excess false positives arise due to bias in the test statistic, which is in our case obtained using nonparametric kernel ridge regression. We propose SplitKCI, an automated method for bias control for the Kernel-based Conditional Independence (KCI) test based on data splitting. We show that our approach significantly improves test level control for KCI without sacrificing test power, both theoretically and for synthetic and real-world data.
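
To make the role of kernel ridge regression and the train/test split concrete, the following is a minimal sketch of a KCI-type statistic, not the authors' implementation. It assumes Gaussian kernels with a fixed bandwidth, a fixed ridge parameter `lam`, and one common form of the statistic (the mean of the elementwise product of the $C$ Gram matrix with the residual Gram matrices of $A$ and $B$). It omits the further splitting of the training data that distinguishes SplitKCI, the wild bootstrap, and the split-ratio heuristic. All function names (`gaussian_gram`, `residual_gram`, `kci_statistic`, `simulate`) are hypothetical.

```python
import numpy as np


def gaussian_gram(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    sq = np.sum(x**2, axis=1)[:, None] + np.sum(y**2, axis=1)[None, :] - 2.0 * x @ y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def residual_gram(x_tr, c_tr, x_te, c_te, lam=1e-2):
    """Gram matrix of the residuals phi(x_i) - mu_hat(c_i) on the test points.

    mu_hat is a conditional mean embedding fitted by kernel ridge regression
    on the training points only (an assumed, simplified estimator).
    """
    m = c_tr.shape[0]
    # Column j of w holds the ridge-regression weights of test point j.
    w = np.linalg.solve(gaussian_gram(c_tr, c_tr) + lam * m * np.eye(m),
                        gaussian_gram(c_tr, c_te))
    k_te_tr = gaussian_gram(x_te, x_tr)
    return (gaussian_gram(x_te, x_te)
            - k_te_tr @ w
            - w.T @ k_te_tr.T
            + w.T @ gaussian_gram(x_tr, x_tr) @ w)


def kci_statistic(a_tr, b_tr, c_tr, a_te, b_te, c_te, lam=1e-2):
    """V-statistic: mean of K_C * K_A_residual * K_B_residual on the test split."""
    k_a = residual_gram(a_tr, c_tr, a_te, c_te, lam)
    k_b = residual_gram(b_tr, c_tr, b_te, c_te, lam)
    k_c = gaussian_gram(c_te, c_te)
    return float(np.mean(k_c * k_a * k_b))


def simulate(n, dependent, rng):
    """Toy data: A and B both depend on C; they share noise only if `dependent`."""
    c = rng.normal(size=(n, 1))
    eps_a = 0.5 * rng.normal(size=(n, 1))
    eps_b = 0.5 * rng.normal(size=(n, 1))
    a = c + eps_a
    b = c + (eps_a if dependent else eps_b)
    return a, b, c


rng = np.random.default_rng(0)
m, n = 200, 200  # m training points for the regressions, n test points
stats = {}
for dependent in (False, True):
    a, b, c = simulate(m + n, dependent, rng)
    stats[dependent] = kci_statistic(a[:m], b[:m], c[:m], a[m:], b[m:], c[m:])
print("conditionally independent:", stats[False])
print("conditionally dependent:  ", stats[True])
```

In this toy setup the statistic should be noticeably larger when $A$ and $B$ share noise beyond $C$. Turning it into a test still requires a calibrated rejection threshold, which the paper obtains with a wild bootstrap.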

Paper Structure

This paper contains 41 sections, 8 theorems, 73 equations, 20 figures, 3 algorithms.

Key Result

Theorem 2

Each of the following conditions holds if and only if $A \perp\!\!\!\perp B \mid C$ (the condition eq:daudin-one-and-a-half is not explicitly stated by daudin1980partial, but follows from eq:daudin-one; see pogodin2022efficient, summarised in app:sec:kci), where in each case the test functions $h$ a

Figures (20)

  • Figure 1: RatInABox data visualisation. A. Simulated rat trajectory in a square box with a blocked off centre. B. Activation patterns of three grid cells w.r.t. rat's position in the box. C. Simulated neural activity of the cells in B. Black dots indicate which data points are used in the dataset. D. Activation patterns of three head direction cells w.r.t. rat's head direction (polar coordinates). E. Same as C, but for the head direction cells.
  • Figure 2: Train/test splitting in a post-nonlinear model with $d=4$ and a fixed dataset size $N=n+m$ (for $n$ test and $m$ training points). A. Type I error (top) vs. rejection rate for the train/test split heuristic (bottom) for different dataset sizes $N$ and test to train split ratios. B. Type II error (top) vs. rejection rate for the train/test split heuristic (bottom) for different dataset sizes $N$ and split ratios. Lines/shaded area: mean/$\pm$SE over 100 trials, $\alpha=0.05$. Asterisks: approximate split ratios from \ref{alg:train_test_split} to show performance of the splitting heuristic.
  • Figure 3: Post-nonlinear model experiments for increasing dimensionality of the task and a fixed dataset size $N=n+m$ (for $n$ test and $m$ training points). A. Type I error for $N=200$ (left) and $N=400$ (right) data points. B. Type II error for $N=200$ (left) and $N=400$ (right) data points. Lines/shaded area: mean/$\pm$SE over 100 trials, $\alpha=0.05$.
  • Figure 4: Train/test splitting in the synthetic neural data. Type I error (top) vs. rejection rate for the train/test split heuristic (bottom) for different dataset sizes $N=n+m$ and test to train ($n/m$) split ratios. Lines/shaded area: mean/$\pm$SE over 100 trials, $\alpha=0.05$. Asterisks: approximate split ratios from \ref{alg:train_test_split} to show performance of the splitting heuristic.
  • Figure 5: Train/test splitting in the synthetic neural data. Type II error (top) vs. rejection rate for the train/test split heuristic (bottom) for different dataset sizes $N=n+m$ and test to train ($n/m$) split ratios. Lines/shaded area: mean/$\pm$SE over 100 trials, $\alpha=0.05$. Asterisks: approximate split ratios from \ref{alg:train_test_split} to show performance of the splitting heuristic.
  • ...and 15 more figures

Theorems & Definitions (11)

  • Definition 1: daudin1980partial
  • Theorem 2: daudin1980partial
  • Theorem 3
  • Definition 4: SplitKCI
  • Definition 5: $\beta$-interpolation space, steinwart2012mercer, li2023optimal
  • Theorem 6: Bias in KCI and SplitKCI
  • Theorem 7: Wild bootstrap
  • Theorem 7
  • Theorem 7: Bias in KCI and SplitKCI
  • Theorem 7: Wild bootstrap
  • ...and 1 more