
Online FDR Controlling procedures for statistical SIS Model and its application to COVID19 data

Seohwa Hwang, Junyong Park

TL;DR

By leveraging the conditional local FDR (LIS), which has been shown to be more powerful than traditional $p$-value-based methods, the proposed online procedure achieves higher detection power than existing methods while maintaining rigorous FDR control.

Abstract

We propose an online false discovery rate (FDR) controlling method based on conditional local FDR (LIS), designed for infectious disease datasets that are discrete and exhibit complex dependencies. Unlike existing online FDR methods, which often assume independence or suffer from low statistical power in dependent settings, our approach effectively controls FDR while maintaining high detection power in realistic epidemic scenarios. For disease modeling, we establish a Dynamic Bayesian Network (DBN) structure within the Susceptible-Infected-Susceptible (SIS) model, a widely used epidemiological framework for infectious diseases. Our method requires no additional tuning parameters apart from the width of the sliding window, making it practical for real-time disease monitoring. From a statistical perspective, we prove that our method ensures valid FDR control under stationary and ergodic dependencies, extending online hypothesis testing to a broader range of dependent and discrete datasets. Additionally, our method achieves higher statistical power than existing approaches by leveraging LIS, which has been shown to be more powerful than traditional $p$-value-based methods. We validate our method through extensive simulations and real-world applications, including the analysis of infectious disease incidence data. Our results demonstrate that the proposed approach outperforms existing methods by achieving higher detection power while maintaining rigorous FDR control.
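
To illustrate how LIS values can drive rejection decisions, the sketch below applies a simple greedy rule to a stream of precomputed LIS values: a hypothesis is rejected only if the running average LIS over all rejections, which estimates the false discovery proportion, stays at or below $\alpha$. This is not the paper's algorithm, which relies on an adaptive barrier and a sliding window; the function name `greedy_online_lis_rejections` and the input values are hypothetical.

```python
from typing import Iterable, List


def greedy_online_lis_rejections(lis_stream: Iterable[float], alpha: float) -> List[bool]:
    """Greedy online rule (illustrative assumption, not the paper's procedure).

    Each LIS value is treated as the posterior probability that the
    corresponding hypothesis is null. Hypothesis t is rejected when doing so
    keeps the mean LIS over all rejections made so far at or below alpha.
    """
    decisions: List[bool] = []
    lis_sum = 0.0   # sum of LIS values over rejected hypotheses
    n_rejected = 0  # number of rejections so far
    for lis in lis_stream:
        if (lis_sum + lis) / (n_rejected + 1) <= alpha:
            lis_sum += lis
            n_rejected += 1
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions


# Hypothetical LIS values for five consecutive time points.
example_decisions = greedy_online_lis_rejections([0.01, 0.30, 0.02, 0.90, 0.05], alpha=0.1)
```

Because the rule only looks at past rejections and the current LIS value, it can be applied as each observation arrives, which is the setting real-time disease monitoring requires.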

Paper Structure (22 sections, 2 theorems, 29 equations, 8 figures, 2 tables, 2 algorithms)

This paper contains 22 sections, 2 theorems, 29 equations, 8 figures, 2 tables, 2 algorithms.

Key Result

Theorem 4.1

Suppose the paper's stated assumptions hold. Then, applying Algorithm alg:onlineFDR, the adaptive barrier obtained at significance level $\alpha$ satisfies at least one of the following:

Figures (8)

  • Figure 1: Dynamic Bayesian network (DBN) for the proposed model with latent state sequence $\{\theta_t\}_{t\ge1}$. The initial distribution is $P(\theta_1=i)=\pi_i$ for $i\in\{1,2,3\}$, and the transition law is $P(\theta_{t+1}=i \mid \theta_t=j)=a_{ij}$ for $i,j\in\{1,2,3\}$.
  • Figure 2: Generated data with a fixed seed (seed = 1) and parameter values $(\gamma_1, \gamma_2, \gamma_3) = (0.8, 1, 1.2)$. The left plot shows the trend without seasonal effects, while the right plot includes seasonal effects, highlighting periodic fluctuations. The shaded areas indicate periods of increasing values, representing days that should be rejected.
  • Figure 3: Online FDR analysis of daily infectious counts ($J_t$) in Australia. The top two panels show the raw and the log-scaled/smoothed data, respectively. Grey areas mark training periods. The lower four panels show the results for four different online FDR procedures, with red markers indicating rejection times.
  • Figure 4: Online FDR analysis of daily infectious counts ($J_t$) in South Korea. The top two panels show the raw and the log-scaled/smoothed data, respectively. Grey areas mark training periods. The lower four panels show the results for four different online FDR procedures, with red markers indicating rejection times.
  • Figure 5: $J_t$ for Mycoplasma pneumoniae in South Korea with $15$ weeks of training (grey), and rejections by online FDR controlling procedures (red).
  • ...and 3 more figures
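
The latent state sequence in Figure 1 can be simulated with a few lines of Python. The sketch below follows the caption's notation, where $a_{ij}=P(\theta_{t+1}=i \mid \theta_t=j)$, so each column of the transition matrix sums to one. The function name `simulate_latent_states` and the numerical values of `pi` and `A` are hypothetical and are not taken from the paper.

```python
import numpy as np


def simulate_latent_states(pi, A, T, seed=0):
    """Draw theta_1, ..., theta_T from a 3-state Markov chain.

    pi[i-1]     = P(theta_1 = i)
    A[i-1, j-1] = a_ij = P(theta_{t+1} = i | theta_t = j), as in the caption,
    so the columns of A (not the rows) are probability vectors.
    """
    rng = np.random.default_rng(seed)
    pi = np.asarray(pi, dtype=float)
    A = np.asarray(A, dtype=float)
    if not (np.isclose(pi.sum(), 1.0) and np.allclose(A.sum(axis=0), 1.0)):
        raise ValueError("pi and every column of A must sum to 1")
    states = np.empty(T, dtype=int)
    states[0] = rng.choice(3, p=pi)  # zero-based state index
    for t in range(1, T):
        # The column indexed by the current state gives the next-state law.
        states[t] = rng.choice(3, p=A[:, states[t - 1]])
    return states + 1  # report states as 1, 2, 3


# Hypothetical parameters chosen only for illustration.
pi_example = [0.3, 0.4, 0.3]
A_example = [[0.90, 0.05, 0.05],
             [0.05, 0.90, 0.05],
             [0.05, 0.05, 0.90]]
theta = simulate_latent_states(pi_example, A_example, T=100, seed=1)
```

Large diagonal entries make the chain persistent, which yields the long runs in a single state that an epidemic phase would produce.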

Theorems & Definitions (4)

  • Theorem 4.1: Adaptive Barrier
  • proof
  • Theorem 4.2: FDR control
  • proof