TabKD: Tabular Knowledge Distillation through Interaction Diversity of Learned Feature Bins

Shovon Niverd Pereira; Krishna Khadka; Yu Lei

Abstract

Data-free knowledge distillation enables model compression without access to the original training data, which is critical for privacy-sensitive tabular domains. However, existing methods perform poorly on tabular data because they do not explicitly address feature interactions, the fundamental way tabular models encode predictive knowledge. We identify interaction diversity, the systematic coverage of feature combinations, as an essential requirement for effective tabular distillation. To operationalize this insight, we propose TabKD, which learns adaptive feature bins aligned with teacher decision boundaries, then generates synthetic queries that maximize pairwise interaction coverage. Across 4 benchmark datasets and 4 teacher architectures, TabKD achieves the highest student-teacher agreement in 14 out of 16 configurations, outperforming 5 state-of-the-art baselines. We further show that interaction coverage strongly correlates with distillation quality, validating our core hypothesis. Our work establishes interaction-focused exploration as a principled framework for tabular model extraction.
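To make the notion of pairwise interaction coverage concrete, the following is a minimal sketch, not the authors' implementation. It assumes the bin boundaries are already given as fixed cut points (TabKD learns them from the teacher), and it counts coverage as the fraction of (feature pair, bin pair) combinations hit by at least one synthetic sample. The function name `pairwise_interaction_coverage` and the `bin_edges` parameter are hypothetical, introduced only for illustration.

```python
from itertools import combinations

import numpy as np


def pairwise_interaction_coverage(X, bin_edges):
    """Fraction of (feature pair, bin pair) combinations hit by at least one row.

    X: array of shape (n_samples, n_features).
    bin_edges: one sorted 1-D array of interior cut points per feature
               (k cut points give k + 1 bins). Assumed fixed here;
               the paper learns the bins from teacher predictions.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    # Map every value to its bin index, one feature at a time.
    binned = np.stack(
        [np.digitize(X[:, j], bin_edges[j]) for j in range(n_features)], axis=1
    )
    n_bins = [len(edges) + 1 for edges in bin_edges]

    covered = 0
    total = 0
    for i, j in combinations(range(n_features), 2):
        # Distinct bin pairs observed for this pair of features.
        seen = set(zip(binned[:, i].tolist(), binned[:, j].tolist()))
        covered += len(seen)
        total += n_bins[i] * n_bins[j]
    return covered / total


# Three features, each split into two bins at 0.5.
edges = [np.array([0.5]), np.array([0.5]), np.array([0.5])]
diagonal = np.array([[0.1, 0.1, 0.1], [0.9, 0.9, 0.9]])
print(pairwise_interaction_coverage(diagonal, edges))  # 0.5
```

In this toy case the two samples hit only the (low, low) and (high, high) bin pairs for each of the three feature pairs, so 6 of the 12 possible combinations are covered. A generator that optimizes for interaction diversity would aim to push this fraction toward 1.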

Paper Structure (37 sections, 10 equations, 2 figures, 3 tables)

Introduction
Related Work
Model Extraction Attacks
Query-Based Extraction.
Data-Free Extraction.
Knowledge Distillation
Standard Distillation.
Data-Free Distillation.
Tabular Data Synthesis
Gap in Prior Work.
Background
Data-free Knowledge Distillation
T-way Combinatorial Testing
Approach
Approach Overview
...and 22 more sections

Figures (2)

Figure 1: TabKD Framework. The bin learner partitions each feature into semantically meaningful regions based on teacher predictions. The generator then produces samples maximizing pairwise interaction coverage across these bins, while a hardness objective targets student weaknesses. The student learns from this diverse, challenging synthetic data.
Figure 2: Student-teacher agreement versus interaction coverage.
