Independence Testing for Temporal Data

Cencheng Shen, Jaewon Chung, Ronak Mehta, Ting Xu, Joshua T. Vogelstein

TL;DR

This work develops a nonparametric independence test for stationary time series. It aggregates cross-lag dependence statistics into a single temporal dependence statistic $\mathrm{T}_n$, estimates the optimal dependence lag $\hat{L}^*$, and uses block permutation to obtain valid $p$-values without multiple-testing corrections. The framework supports multiple dependence measures, including $\mathrm{DCorr}$, $\mathrm{HSIC}$, and $\mathrm{MGC}$. The authors prove asymptotic validity under the null and universal consistency under alternatives, and they report strong empirical performance in simulations and in real-data applications (fMRI connectivity and stock market analysis). The method detects both linear and nonlinear temporal dependencies, and the paper gives practical guidance on lag selection and computational considerations for use in neuroscience, finance, and other fields.

Abstract

Temporal data are increasingly prevalent in modern data science. A fundamental question is whether two time series are related. Existing approaches often have limitations, such as relying on parametric assumptions, detecting only linear associations, and requiring multiple tests and corrections. While many nonparametric and universally consistent dependence measures have recently been proposed, directly applying them to temporal data can inflate the p-value and result in an invalid test. To address these challenges, this paper introduces the temporal dependence statistic with block permutation to test independence between temporal data. Under proper assumptions, the proposed procedure is asymptotically valid and universally consistent for testing independence between stationary time series, and it can estimate the optimal dependence lag, that is, the lag at which the dependence is maximized. Moreover, it is compatible with a rich family of distance- and kernel-based dependence measures, eliminates the need for multiple testing, and exhibits excellent testing power in various simulation settings.
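To make the idea in the abstract concrete, here is a minimal sketch of a lag-aggregated statistic with a block permutation test. It is an illustration under stated assumptions, not the authors' implementation: the lag weights $(n-l)/n$, the block size $\lceil\sqrt{n}\rceil$, the use of the biased squared distance correlation, and all function names (`dcorr`, `temporal_stat`, `block_permute`, `block_permutation_test`) are choices made here for illustration.

```python
import numpy as np

def _centered_dist(z):
    # Pairwise Euclidean distances, double-centered (rows, columns, grand mean).
    z = np.asarray(z, dtype=float).reshape(len(z), -1)
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def dcorr(x, y):
    # Biased sample squared distance correlation (a simplification assumed here).
    a, b = _centered_dist(x), _centered_dist(y)
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if denom == 0 else float((a * b).mean() / denom)

def temporal_stat(x, y, max_lag):
    # Weighted sum over lags l = 0..max_lag of the dependence between x_t and y_{t+l}.
    # The (n - l) / n weights are an assumption of this sketch.
    n = len(x)
    lags = np.arange(max_lag + 1)
    per_lag = np.array([dcorr(x[: n - l], y[l:]) for l in lags])
    weighted = (n - lags) / n * per_lag
    return float(weighted.sum()), int(np.argmax(weighted))

def block_permute(n, block_size, rng):
    # Shuffle contiguous blocks of indices, keeping the order inside each block.
    starts = np.arange(0, n, block_size)
    order = rng.permutation(len(starts))
    return np.concatenate(
        [np.arange(s, min(s + block_size, n)) for s in starts[order]]
    )

def block_permutation_test(x, y, max_lag=1, reps=200, seed=0):
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    block_size = int(np.ceil(np.sqrt(n)))  # assumed block size
    observed, best_lag = temporal_stat(x, y, max_lag)
    null = np.array([
        temporal_stat(x, y[block_permute(n, block_size, rng)], max_lag)[0]
        for _ in range(reps)
    ])
    # Single p-value for the aggregated statistic, so no per-lag correction is needed.
    p_value = (1 + np.sum(null >= observed)) / (1 + reps)
    return observed, best_lag, float(p_value)
```

Permuting whole blocks rather than individual time points keeps most of the within-series autocorrelation intact in the null samples, which is the reason a plain permutation test is not appropriate for temporal data.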

Paper Structure

This paper contains 25 sections, 8 theorems, 60 equations, 9 figures, 1 table.

Key Result

Theorem 1

The cross dependence sample statistic satisfies: Therefore, for each $l \in \{0,...,L\}$, we have in probability.

Figures (9)

  • Figure 1: This figure illustrates the validity of the tests using two independent time series. In the left panel, the testing power is computed as the sample size increases, with an AR coefficient of $\phi=0.5$. The right panel keeps the sample size at $n=1200$ while varying the AR coefficient $\phi$, with the noise variance appropriately adjusted by $(1 - \phi^2)$, based on the same simulation as in shifthsic. The dashed black line represents the significance level $\alpha=0.05$.
  • Figure 2: The testing power for linear (left panel) and nonlinear (right panel) simulations based on $300$ replicates.
  • Figure 3: The testing power for the extinct Gaussian simulation based on $300$ replicates.
  • Figure 4: This figure displays the performance of our proposed method using both MGC and DCorr for estimating the optimal dependence lag $\hat{L}^{*}$ in linear and nonlinear relationships. The colored bar above lag $j$ shows the empirical frequency of $\hat{L}^{*}=j$, with red representing MGC and purple representing DCorr. The frequencies are estimated from $100$ trials. The first row shows DCorr estimation performance at sample sizes $n=15, 30, 60$ for linear relationships, while the second row shows the MGC performance on the same data. The third row displays DCorr estimation for nonlinear relationships, and the last row presents the same for MGC.
  • Figure 5: This figure shows the testing power for multivariate simulations, with a constant sample size of $n=100$ while increasing the dimensionality.
  • ...and 4 more figures
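
The independent-series setup described in the Figure 1 caption can be sketched as follows. This is a hedged illustration, not the paper's simulation code: it assumes a Gaussian AR(1) process $X_t = \phi X_{t-1} + \epsilon_t$ with $\epsilon_t \sim N(0, 1-\phi^2)$, so that the stationary variance stays at $1$ as $\phi$ varies. The function name `simulate_ar1` is hypothetical.

```python
import numpy as np

def simulate_ar1(n, phi, rng):
    # Noise variance (1 - phi^2) keeps the stationary variance of X_t equal to 1.
    sigma = np.sqrt(1.0 - phi**2)
    x = np.empty(n)
    x[0] = rng.standard_normal()  # start from the stationary N(0, 1) distribution
    for t in range(1, n):
        x[t] = phi * x[t - 1] + sigma * rng.standard_normal()
    return x

# Two independent series with the same AR coefficient, as in a validity check.
rng = np.random.default_rng(0)
x = simulate_ar1(1200, 0.5, rng)
y = simulate_ar1(1200, 0.5, rng)
```

Because the two series are generated independently, a valid test applied to `x` and `y` should reject at a rate close to the significance level $\alpha$.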

Theorems & Definitions (12)

  • Theorem 1
  • Theorem 2
  • Theorem 3: Asymptotic Validity
  • Theorem 4: Testing Consistency
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3: Asymptotic Validity
  • proof
  • ...and 2 more