A Unified View of Optimal Kernel Hypothesis Testing

Antonin Schrab

TL;DR

This work presents a unified, minimax-centered view of kernel-based hypothesis testing across three core problems: two-sample testing (MMD), independence testing (HSIC), and goodness-of-fit testing (KSD). It develops adaptive kernel strategies (aggregation and kernel pooling) that achieve high power without prior knowledge of the optimal kernel, and analyzes uniform separation rates under standard, computationally efficient, differentially private, and robust settings. The results establish minimax-optimal rates against alternatives separated in both the kernel and $L^2$ metrics, and detail practical constructions via permutations and wild bootstrap for non-asymptotic level control. The work also highlights open problems, notably lower bounds for $L^2$ separation under differential privacy, and private or robust KSD goodness-of-fit testing. Collectively, the framework provides a cohesive blueprint for kernel-based tests with rigorous power guarantees and practical adaptation under realistic constraints, enabling reliable nonparametric hypothesis testing in large-scale settings.

Abstract

This paper provides a unifying view of optimal kernel hypothesis testing across the MMD two-sample, HSIC independence, and KSD goodness-of-fit frameworks. Minimax-optimal separation rates in the kernel and $L^2$ metrics are presented, together with two adaptive kernel selection methods (kernel pooling and aggregation), under various testing constraints: computational efficiency, differential privacy, and robustness to data corruption. Intuition behind the derivation of the power results is provided in a unified way across the three frameworks, and open problems are highlighted.
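To make the MMD two-sample framework and the permutation-based level control more concrete, the following is a minimal sketch rather than the paper's own construction. It assumes a Gaussian kernel with a fixed bandwidth, uses the biased (V-statistic) estimate of the squared MMD, and performs no kernel adaptation. The function names `gaussian_gram` and `mmd_permutation_test` and all parameter defaults are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_gram(z, bandwidth):
    """Gaussian kernel Gram matrix for the rows of z (assumed kernel choice)."""
    sq_norms = np.sum(z**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * z @ z.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_permutation_test(x, y, bandwidth=1.0, n_perm=200, alpha=0.05, seed=0):
    """Two-sample test using a biased squared-MMD estimate and permutations.

    Returns the observed statistic, the permutation p-value, and the
    decision to reject at level alpha.
    """
    rng = np.random.default_rng(seed)
    m, n = len(x), len(y)
    # The Gram matrix of the pooled sample is computed once and reused.
    gram = gaussian_gram(np.vstack([x, y]), bandwidth)

    def mmd_squared(index):
        a, b = index[:m], index[m:]
        return (
            gram[np.ix_(a, a)].mean()
            + gram[np.ix_(b, b)].mean()
            - 2.0 * gram[np.ix_(a, b)].mean()
        )

    observed = mmd_squared(np.arange(m + n))
    permuted = np.array(
        [mmd_squared(rng.permutation(m + n)) for _ in range(n_perm)]
    )
    # Counting the observed statistic among the permuted ones (the +1 terms)
    # keeps the level at most alpha for any sample size under exchangeability.
    p_value = (1 + np.sum(permuted >= observed)) / (1 + n_perm)
    return observed, p_value, bool(p_value <= alpha)
```

Calling `mmd_permutation_test(x, y)` on two arrays of shape `(m, d)` and `(n, d)` returns the statistic, the p-value, and the test decision. The fixed `bandwidth` is exactly the quantity that the adaptive methods named in the abstract (kernel pooling and aggregation) are meant to avoid having to choose by hand.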
