A Numerical Rosenblatt Method for Forced Variable Independence

Radek Vavřička; Tomáš Sýkora

TL;DR

Data-driven particle physics analyses that rely on the ABCD method require the two observables used to be independent for background events. This paper proposes a Rosenblatt-inspired framework that transforms one observable into a classifier $\gamma$ that is independent of the other observable for background-like data while preserving marginal distributions. Two numerical implementations, IRGI and KDE, construct $\gamma$ from finite samples: IRGI uses irregular grid binning to produce $\gamma_d$, and KDE uses Gaussian kernel smoothing to produce $\gamma_{\sigma_r}$. Both come with explicit formulas and a tunable parameter ($d$ and $\sigma_r$, respectively). Across abstract blob cases, image classification tasks, and a high-energy physics dataset (LHC Olympics), the methods substantially reduce the distance correlation $\text{DCC}$ while maintaining discriminative power (AUC), which enables robust ABCD-based signal estimation and improves classifier independence in practice.

Abstract

A novel numerical technique is presented to transform one random variable within a system toward statistical quasi-independence from any other random variable in the system. The method's applicability is demonstrated through a particle physics example where a classifier is rendered quasi-independent from an observable quantity.
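To make the idea of a Rosenblatt-style transformation concrete, the following is a minimal sketch and not the paper's IRGI or KDE formulas, which are not reproduced on this page. It assumes the transformed variable is a kernel-weighted estimate of the conditional CDF $F(y \mid x)$, evaluated at each sample. The function name `conditional_cdf_transform`, the `bandwidth` parameter, and the toy Gaussian data are all hypothetical, and Pearson correlation is used only as a crude stand-in for a proper dependence measure such as distance correlation. The output of this sketch is approximately uniform on $(0, 1]$, so it does not illustrate any marginal-preserving step.

```python
import numpy as np


def conditional_cdf_transform(x, y, bandwidth):
    """Kernel-weighted estimate of F(y_i | x_i) for every sample i.

    Hypothetical helper for illustration; not the estimator from the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Gaussian weights in x: w[i, j] is large when x_j is close to x_i.
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    # below[i, j] is 1 when y_j <= y_i (includes j == i).
    below = (y[None, :] <= y[:, None]).astype(float)
    # Weighted fraction of neighbours in x whose y lies at or below y_i.
    return (w * below).sum(axis=1) / w.sum(axis=1)


def pearson(a, b):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


# Toy "background" sample (assumed): y is strongly correlated with x.
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y = 0.8 * x + 0.6 * rng.normal(size=n)

gamma = conditional_cdf_transform(x, y, bandwidth=0.2)

print("corr(x, y)     =", pearson(x, y))
print("corr(x, gamma) =", pearson(x, gamma))
```

In this sketch a smaller `bandwidth` follows the dependence on `x` more closely but gives noisier estimates where data are sparse, which is the usual trade-off when tuning a smoothing parameter.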
