Learning Exceptional Subgroups by End-to-End Maximizing KL-divergence

Sascha Xu; Nils Philipp Walter; Janis Kalofolias; Jilles Vreeken

TL;DR

Syflow addresses the problem of discovering exceptional subgroups by formulating subgroup discovery as a differentiable, KL-divergence maximization task. It integrates flexible target distribution modeling via normalizing flows with a differentiable neuro-symbolic rule learner to produce interpretable subgroups, while encouraging diversity and meaningful size. The approach scales to large datasets and handles complex target distributions, as demonstrated on synthetic data, real-world regression tasks, and a materials-science case study on gold nano-clusters. Overall, Syflow provides a practical, scalable framework for discovering diverse, physically plausible subgroups with human-readable descriptions, advancing descriptive analytics beyond traditional discretization-based methods.
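To make the KL-divergence objective concrete, here is a minimal sketch of how the exceptionality of a softly defined subgroup could be scored. It assumes, purely for illustration, that both the subgroup and the overall target distribution are summarized by fitted Gaussians; Syflow itself is described as using normalizing flows for this step. The function names (`weighted_gaussian`, `gaussian_kl`, `exceptionality`) are hypothetical and do not come from the paper.

```python
import math

def weighted_gaussian(ys, ws):
    """Fit a Gaussian to targets ys under soft membership weights ws."""
    total = sum(ws)
    mean = sum(w * y for y, w in zip(ys, ws)) / total
    var = sum(w * (y - mean) ** 2 for y, w in zip(ys, ws)) / total
    return mean, math.sqrt(var)

def gaussian_kl(m1, s1, m0, s0):
    """Closed-form KL( N(m1, s1^2) || N(m0, s0^2) )."""
    return math.log(s0 / s1) + (s1 ** 2 + (m1 - m0) ** 2) / (2 * s0 ** 2) - 0.5

def exceptionality(ys, memberships):
    """KL divergence of the subgroup's target distribution from the overall one.

    Assumption: Gaussian stand-ins replace the flow-based density estimates.
    """
    m_sub, s_sub = weighted_gaussian(ys, memberships)
    m_all, s_all = weighted_gaussian(ys, [1.0] * len(ys))
    return gaussian_kl(m_sub, s_sub, m_all, s_all)

# Toy targets: the first three samples form a clearly shifted subgroup.
targets = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
score_subgroup = exceptionality(targets, [1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
score_everyone = exceptionality(targets, [1.0] * 6)
```

Because the membership weights enter the score through sums and products only, the same computation stays differentiable when the weights come from a learned rule, which is what lets the objective be maximized end to end.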

Abstract

Finding and describing sub-populations that are exceptional regarding a target property has important applications in many scientific disciplines, from identifying disadvantaged demographic groups in census data to finding conductive molecules within gold nanoparticles. Current approaches to finding such subgroups require pre-discretized predictive variables, do not permit non-trivial target distributions, do not scale to large datasets, and struggle to find diverse results. To address these limitations, we propose Syflow, an end-to-end optimizable approach in which we leverage normalizing flows to model arbitrary target distributions, and introduce a novel neural layer that results in easily interpretable subgroup descriptions. We demonstrate on synthetic and real-world data, including a case study, that Syflow reliably finds highly exceptional subgroups accompanied by insightful descriptions.

Paper Structure (32 sections, 3 theorems, 25 equations, 8 figures, 4 tables)

This paper contains 32 sections, 3 theorems, 25 equations, 8 figures, 4 tables.

Introduction
Preliminaries
Method
Overview
Differentiable Rule Induction
Differentiable Density Estimation
Differentiable Exceptionality Measure
Rule Generality and Diversity
Full Model
Related Work
Subgroup Discovery.
Differentiable Rule Induction
Experiments
Synthetic Data
Target Distribution
...and 17 more sections

Key Result

Theorem 1

Given its lower and upper bounds $\alpha_i, \beta_i \in \mathbb{R}$, the soft predicate (as defined in the paper's predicate equation) applied on $x \in \mathbb{R}$ converges to the crisp predicate that decides whether $x \in (\alpha_i, \beta_i)$,
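The convergence behaviour stated in Theorem 1 can be illustrated with a small sketch. The soft predicate below is an assumed stand-in, a product of two sigmoids with a temperature parameter `t`, and is not necessarily the form used in the paper's predicate equation. The names `soft_predicate` and `crisp_predicate` are likewise hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_predicate(x, alpha, beta, t):
    """Soft membership of x in (alpha, beta) at temperature t > 0.

    Assumption: sigmoid-product relaxation, chosen for illustration only.
    """
    return sigmoid((x - alpha) / t) * sigmoid((beta - x) / t)

def crisp_predicate(x, alpha, beta):
    """Hard membership test for the open interval (alpha, beta)."""
    return 1.0 if alpha < x < beta else 0.0

# Lowering the temperature pushes the soft value towards the crisp decision.
alpha, beta = 0.0, 1.0
for t in (1.0, 0.1, 0.001):
    inside = soft_predicate(0.5, alpha, beta, t)
    outside = soft_predicate(1.5, alpha, beta, t)
    print(f"t={t}: inside={inside:.4f}, outside={outside:.4f}")
```

At high temperature the predicate is smooth in `alpha` and `beta`, so the bounds can be learned by gradient descent. As `t` shrinks, the predicate approaches a hard interval test that can be read off as a rule such as "alpha < x < beta".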

Figures (8)

Figure 1: Subgroups. $\textsc{Syflow}$ learns subgroups, that is, named subpopulations whose target-variable distribution is exceptional. In (a) $\textsc{Syflow}$ precisely describes the subgroup of "Women without higher education", whose distribution of the target quantity wage is significantly lower (b). In general, $\textsc{Syflow}$ is applicable to any data with non-trivial target distributions, e.g. materials science (c).
Figure 2: Example soft predicate (a) and soft rule (b).
Figure 3: Subgroup Predictive Accuracy. (a) Method comparison in terms of F1-score recovering subgroups in synthetic data. (b) Across different distributions: $\textsc{Syflow}$ outperforms the competition on distributions with higher-order moments. (c) With increasing number of cutpoints, $\textsc{Sd}\text{-}\mu$ matches $\textsc{Syflow}$ accuracy around 40 bins, but needs 10$\times$ more time.
Figure 4: Scalability of $\textsc{Syflow}$ and baselines.
Figure 5: Subgroups learned on the Insurance and Wages datasets. Only $\textsc{Syflow}$ learns diverse and exceptional subgroups.
...and 3 more figures

Theorems & Definitions (3)

Theorem 1
Theorem 2
Theorem 3
