False positive control in time series coincidence detection

Ruiting Liang, Samuel Dyson, Rina Foygel Barber, Daniel E. Holz

TL;DR

The theoretical results establish rigorous finite-sample guarantees controlling the probability of false positives, under weak assumptions that allow for dependence within the time series data, providing reassurance that time-shifting methods are a reliable tool for inference in this setting.

Abstract

We study the problem of coincidence detection in time series data, where we aim to determine whether the appearance of simultaneous or near-simultaneous events in two time series is indicative of some shared underlying signal or synchronicity, or might simply be due to random chance. This problem arises across many applications, such as astrophysics (e.g., detecting astrophysical events such as gravitational waves, with two or more detectors) and neuroscience (e.g., detecting synchronous firing patterns between two or more neurons). In this work, we consider methods based on time-shifting, where the timeline of one data stream is randomly shifted relative to another, to mimic the types of coincidences that could occur by random chance. Our theoretical results establish rigorous finite-sample guarantees controlling the probability of false positives, under weak assumptions that allow for dependence within the time series data, providing reassurance that time-shifting methods are a reliable tool for inference in this setting. Empirical results with simulated and real data validate the strong performance of time-shifting methods in dependent-data settings.
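The time-shifting idea described above can be illustrated with a minimal sketch. It is not the p-value defined in the paper: it assumes a simple coincidence statistic (the number of times at which both series exceed a threshold), random circular ("wrap-around") shifts of one series, and the usual Monte Carlo `+1` correction. The names `coincidence_count`, `time_shift_pvalue`, `threshold`, and `n_shifts` are hypothetical and chosen only for illustration.

```python
import numpy as np


def coincidence_count(x, y, threshold):
    """Count time points at which both series exceed the threshold."""
    return int(np.sum((x > threshold) & (y > threshold)))


def time_shift_pvalue(x, y, threshold, n_shifts=200, rng=None):
    """Monte Carlo p-value from random circular shifts of y (illustrative only).

    Assumption: the statistic and the shift scheme are simplified stand-ins,
    not the construction used in the paper.
    """
    rng = np.random.default_rng(rng)
    T = len(x)
    observed = coincidence_count(x, y, threshold)
    # Draw nonzero circular shifts from {1, ..., T-1}.
    shifts = rng.integers(1, T, size=n_shifts)
    shifted = np.array(
        [coincidence_count(x, np.roll(y, s), threshold) for s in shifts]
    )
    # Fraction of shifted datasets at least as extreme as the aligned data,
    # with the +1 correction so that the p-value is never exactly zero.
    return (1 + np.sum(shifted >= observed)) / (1 + n_shifts)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=500)
    y = rng.normal(size=500)
    print(time_shift_pvalue(x, y, threshold=1.5, rng=1))
```

A small p-value here suggests that the aligned data show more coincidences than most shifted versions do. Circular shifts keep the dependence structure within each series largely intact, which is the motivation for shifting rather than permuting individual time points.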

Paper Structure

This paper contains 48 sections, 7 theorems, 126 equations, 11 figures.

Key Result

Theorem 1

Assume $\mathbf{X} = (X_1,\dots,X_T)$ and $\mathbf{Y} = (Y_1,\dots,Y_T)$ are each stationary time series. Then, if $\mathbf{X}\perp\!\!\!\perp\mathbf{Y}$, the time-shifted p-value $p_t$ defined in eqn:pval_event satisfies for any threshold $\alpha\in[0,1]$, and any choice of the function $\psi$. Consequently, for any (nonempty) subset of indices $\mathcal{T}\subseteq[T-\De

Figures (11)

  • Figure 1: An illustration of the intuition behind the time-shifting approach. The left panel shows a dataset with time series $\mathbf{X}$ and $\mathbf{Y}$, which exhibit some correlation. The star symbols in the figure mark times at which both $\mathbf{X}$ and $\mathbf{Y}$ exceed a prespecified threshold (indicated by the dotted lines). On the right, we see a time-shifted version of the same dataset, where the $\mathbf{Y}$ time series has been time-shifted relative to $\mathbf{X}$. We see that there are fewer stars (that is, there is less apparent association) within the time-shifted version of the data, suggesting that synchronicity may be present in the original data.
  • Figure 2: Left panel: $\Psi_t$ is calculated using the temporally aligned time window $\{t, \dots, t+\Delta\}$, as illustrated by the shaded segments between $t$ and $t+\Delta$. Right panel: $\Psi_{i,j}$ is calculated with the time-shifted windows, which are illustrated by the shaded segment between $i$ and $i+\Delta$ in $\mathbf{X}$, and the shaded segment between $j$ and $j+\Delta$ in $\mathbf{Y}$.
  • Figure 3: An illustration of $\mathsf{mar}(t)$ as defined in eqn:define_mar(t). When computing $\mathsf{mar}(t)$, the numerator $\min\{t,T-\Delta+1-t\}^2$ is equal to the minimum of the number of blue points and the number of red points shown in this figure. In contrast, the denominator $(T-\Delta)^2$ is approximately equal to the total number of points in the entire square.
  • Figure 4: Left panel: The test statistic $\Psi$ is calculated with temporally aligned data streams $\mathbf{X} = (X_1,\dots,X_T)$ and $\mathbf{Y} = (Y_1,\dots,Y_T)$. Right panel: $\Psi_{i,j}$ is calculated using the "wrap-around" data streams $(X_i,\dots,X_T,X_1,\dots,X_{i-1})$ and $(Y_j,\dots,Y_T,Y_1,\dots,Y_{j-1})$.
  • Figure 5: Illustration of a typical draw of the data for the simulated data experiment, with $\sigma=1$, $q=0.75$, and two different values of $\rho$. The top panel shows data generated with a small value $\rho=0.01$, corresponding to poor mixing of the time series: we see long periods of positive drift, indicating high temporal dependence. In contrast, the data in the bottom panel is generated with a larger value $\rho=0.5$, and the process appears to mix rapidly, with less dependence across time.
  • ...and 6 more figures

Theorems & Definitions (10)

  • Definition 1
  • Theorem 1
  • Definition 2
  • Theorem 2
  • Theorem 3
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Proposition 1
  • proof