DeepDRK: Deep Dependency Regularized Knockoff for Feature Selection

Hongyu Shen, Yici Yan, Zhizhen Zhao

TL;DR

DeepDRK formulates the knockoff model as a learning problem under multi-source adversarial attacks and applies a perturbation technique to achieve lower FDR and higher power than existing methods.

Abstract

Model-X knockoff has garnered significant attention among various feature selection methods due to its guarantees for controlling the false discovery rate (FDR). Since its introduction in parametric design, knockoff techniques have evolved to handle arbitrary data distributions using deep learning-based generative models. However, we have observed limitations in the current implementations of the deep Model-X knockoff framework. Notably, the "swap property" that knockoffs require often faces challenges at the sample level, resulting in diminished selection power. To address these issues, we develop "Deep Dependency Regularized Knockoff (DeepDRK)," a distribution-free deep learning method that effectively balances FDR and power. In DeepDRK, we introduce a novel formulation of the knockoff model as a learning problem under multi-source adversarial attacks. By employing an innovative perturbation technique, we achieve lower FDR and higher power. Our model outperforms existing benchmarks across synthetic, semi-synthetic, and real-world datasets, particularly when sample sizes are small and data distributions are non-Gaussian.

Paper Structure

This paper contains 39 sections, 4 theorems, 26 equations, 15 figures, and 9 tables.

Key Result

Theorem 2.1

Given a knockoff that satisfies the swap property in Eq. eq_paire_wise_check, a knockoff statistic that satisfies Eq. knockoff_proeprty_flip_coin, and the selection set $\mathcal{S} = \{ j : w_j \ge \tau_q \}$, we have $\mathrm{FDR} \leq q$.
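To make the selection rule in the theorem concrete, the following is a minimal sketch of how a threshold $\tau_q$ and the selected set could be computed from the knockoff statistics $w_j$. This page does not define $\tau_q$, so the sketch assumes the standard knockoff+ threshold from the Model-X knockoff literature, $\tau_q = \min\{t > 0 : (1 + |\{j : w_j \le -t\}|) / \max(1, |\{j : w_j \ge t\}|) \le q\}$. The function names `knockoff_threshold` and `select_features` are hypothetical and are not taken from the paper.

```python
import numpy as np

def knockoff_threshold(w, q=0.1, offset=1):
    """Smallest t > 0 whose estimated FDP is at most q (np.inf if none).

    Assumes the knockoff+ rule (offset=1); this is an assumption,
    not a definition given on this page.
    """
    w = np.asarray(w, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes |w_j|.
    candidates = np.unique(np.abs(w[w != 0]))  # np.unique returns sorted values
    for t in candidates:
        n_neg = np.sum(w <= -t)          # proxy count of false discoveries
        n_pos = max(1, np.sum(w >= t))   # number of would-be selections
        if (offset + n_neg) / n_pos <= q:
            return float(t)
    return np.inf

def select_features(w, q=0.1):
    """Indices j with w_j >= tau_q, i.e. the set S in the theorem."""
    w = np.asarray(w, dtype=float)
    tau = knockoff_threshold(w, q)
    return np.flatnonzero(w >= tau)
```

When no threshold meets the target level, the sketch returns an infinite threshold and therefore an empty selection, which is the conservative behavior of the knockoff+ rule.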

Figures (15)

  • Figure 1: Illustration of the DeepDRK pipeline, which consists of two components: (1) the training stage, which optimizes the knockoff Transformer and swappers with $\mathcal{L}_\text{SL}$ and $\mathcal{L}_\text{DRL}$; and (2) the post-training stage, which generates the knockoff $\tilde{X}^{\text{DRP}_{\theta}}$ via dependency regularized perturbation.
  • Figure 2: Power and FDR for different knockoff models on the synthetic datasets with $\beta \sim\frac{p}{15\cdot \sqrt{N}}\cdot \text{Rademacher(0.5)}$. The red horizontal line indicates the 0.1 FDR threshold.
  • Figure 3: Power and FDR for different knockoff models on the mixture of Gaussian data under different $\rho_\text{base}$ setups. The red horizontal line indicates the 0.1 FDR threshold. This figure complements Figure \ref{fig_syn150} by including two additional Gaussian mixture datasets with higher $\rho_{\text{base}}$ values.
  • Figure 4: The knockoff statistics ($w_j$) for different knockoff models on the synthetic datasets with $\beta \sim\frac{p}{15\cdot \sqrt{N}}\cdot \text{Rademacher(0.5)}$. Each bar represents the mean of the null or nonnull knockoff statistics, averaged over 600 experiments. The error bars indicate the standard deviation. The sample size is 200.
  • Figure 5: Scatter plots of Power against FDR for different datasets and models. The red vertical line indicates the 0.1 FDR threshold. Different scales for $\beta$ (e.g., $\frac{p}{5\cdot \sqrt{n}}$, $\frac{p}{10\cdot \sqrt{n}}$, $\frac{p}{15\cdot \sqrt{n}}$ and $\frac{p}{20\cdot \sqrt{n}}$) are indicated by different marker styles. Different models are indicated by different colors.
  • ...and 10 more figures
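
The captions above report power and FDR for coefficients drawn as $\beta \sim \frac{p}{15\cdot \sqrt{N}}\cdot \text{Rademacher(0.5)}$. As a rough illustration, the sketch below draws such a coefficient vector and computes the per-experiment false discovery proportion and power, whose averages over repeated experiments estimate FDR and power. Several details are assumptions not stated on this page: the number of nonnull features `k`, that null features have zero coefficients, and that Rademacher(0.5) means signs of $\pm 1$ with equal probability. The names `sample_beta` and `fdp_and_power` are hypothetical.

```python
import numpy as np

def sample_beta(p, n, k, scale=15, rng=None):
    """Draw a length-p coefficient vector with k nonnull entries.

    Nonnull entries have magnitude p / (scale * sqrt(n)) and a random sign.
    The choice of k and the zero null coefficients are assumptions.
    """
    rng = np.random.default_rng(rng)
    beta = np.zeros(p)
    support = rng.choice(p, size=k, replace=False)   # nonnull indices
    signs = rng.choice([-1.0, 1.0], size=k)          # equal-probability signs
    beta[support] = p / (scale * np.sqrt(n)) * signs
    return beta, support

def fdp_and_power(selected, true_support):
    """False discovery proportion and power for one experiment."""
    selected = set(int(j) for j in selected)
    true_support = set(int(j) for j in true_support)
    false_discoveries = len(selected - true_support)
    fdp = false_discoveries / max(1, len(selected))
    power = len(selected & true_support) / max(1, len(true_support))
    return fdp, power
```

For example, `fdp_and_power([0, 1, 5], [0, 1, 2, 3])` counts one false discovery among three selections and two of four true features recovered.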

Theorems & Definitions (4)

  • Theorem 2.1
  • Lemma 3.1
  • Proposition 3.2
  • Theorem A.1