Decomposed Distribution Matching in Dataset Condensation

Sahar Rahimi Malakshan, Mohammad Saeed Ebrahimi Saadabadi, Ali Dabouei, Nasser M. Nasrabadi

TL;DR

This paper tackles the efficiency–performance gap in Dataset Condensation by decomposing distribution matching into content and style. It introduces Style Matching, made up of Moments Matching (MM) and Correlation Matching (CM), to align style via the first and second moments of feature maps and the correlations among feature maps. It also introduces Intra-Class Diversity (ICD), which uses KL-divergence with a kNN constraint to diversify the condensed samples. The condensed dataset is learned by minimizing a joint objective that combines a style term, $L_S = \alpha L_{MM} + L_{CM}$, and a content term, $L_C = \beta L_{ICD} + L_{MMD}$, giving the overall optimization $\mathcal{S}^* = \arg\min_{\mathcal{S}} (\lambda L_S + L_C)$. Across CIFAR10/100, TinyImageNet, ImageNet-1K subsets, and high-resolution datasets, the method yields consistent improvements over DM, works with multiple architectures, and extends to continual learning, while retaining the computational efficiency of distribution matching. The work provides a practical framework for producing diverse, style-aligned condensed data suitable for large-scale training and continual learning applications.
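The style term $L_S = \alpha L_{MM} + L_{CM}$ can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the squared-error distances, the batch averaging, the single-layer setting, and the function names (`channel_moments`, `gram`, `style_loss`) are all assumptions made for illustration. The summary above only states that MM matches the mean and variance of feature maps and that CM matches correlations among feature maps.

```python
import numpy as np

def channel_moments(feats):
    """Per-channel mean and variance of feature maps, averaged over the batch.

    feats has shape (N, C, H, W); both outputs have shape (C,).
    """
    flat = feats.reshape(feats.shape[0], feats.shape[1], -1)  # (N, C, H*W)
    mean = flat.mean(axis=2).mean(axis=0)
    var = flat.var(axis=2).mean(axis=0)
    return mean, var

def gram(feats):
    """Batch-averaged Gram matrix (channel-by-channel correlations), shape (C, C)."""
    flat = feats.reshape(feats.shape[0], feats.shape[1], -1)  # (N, C, H*W)
    g = flat @ flat.transpose(0, 2, 1) / flat.shape[2]        # (N, C, C)
    return g.mean(axis=0)

def moment_matching_loss(real, syn):
    """Assumed form of L_MM: squared error between real and synthetic moments."""
    mean_r, var_r = channel_moments(real)
    mean_s, var_s = channel_moments(syn)
    return float(np.sum((mean_r - mean_s) ** 2) + np.sum((var_r - var_s) ** 2))

def correlation_matching_loss(real, syn):
    """Assumed form of L_CM: squared error between the two Gram matrices."""
    return float(np.sum((gram(real) - gram(syn)) ** 2))

def style_loss(real, syn, alpha=1.0):
    """L_S = alpha * L_MM + L_CM for the feature maps of one layer."""
    return alpha * moment_matching_loss(real, syn) + correlation_matching_loss(real, syn)
```

In practice the feature maps would come from one or more layers of a network and the loss would be differentiated with respect to the synthetic images; the sketch only evaluates the loss value for given feature maps.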

Abstract

Dataset Condensation (DC) aims to reduce the training effort of deep neural networks by synthesizing a small dataset that is as effective as the original large dataset. Conventionally, DC relies on a costly bi-level optimization, which limits its practicality. Recent research formulates DC as a distribution matching problem, which circumvents the costly bi-level optimization. However, this efficiency comes at the cost of DC performance. To investigate this performance degradation, we decompose the dataset distribution into content and style. Our observations indicate two major shortcomings: 1) style discrepancy between original and condensed data, and 2) limited intra-class diversity of the condensed dataset. We present a simple yet effective method to match the style information between original and condensed data, employing statistical moments of feature maps as well-established style indicators. Moreover, we enhance the intra-class diversity by maximizing the Kullback-Leibler divergence within each synthetic class, i.e., content. We demonstrate the efficacy of our method through experiments on diverse datasets of varying size and resolution, achieving improvements of up to 4.1% on CIFAR10, 4.2% on CIFAR100, 4.3% on TinyImageNet, 2.0% on ImageNet-1K, 3.3% on ImageWoof, 2.5% on ImageNette, and 5.5% in continual learning accuracy.

Paper Structure

This paper contains 21 sections, 11 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: (a, b) 2D t-SNE visualizations of original and condensed images learned by DM zhao2023dataset for CIFAR10 with IPC=50 in a randomly chosen category. (a) Style statistics (concatenation of mean and variance) from the first layer's feature map, highlighting a significant style discrepancy. (b) Final features of the DNN, showing limited diversity of instances learned by DM. (c, d) Illustration of the negative effect of style discrepancy on performance. During training, the style of samples from Herding rebuffi2017icarl is drifted toward that of DM zhao2023dataset, with $\gamma$ representing the drift ratio.
  • Figure 2: (a) Visualization of the proposed method, which includes a Style Matching (SM) module and an Intra-Class Diversity (ICD) component. (b) The SM module includes Moments Matching (MM) and Correlation Matching (CM) losses, which reduce style discrepancies between real and condensed sets by using the mean and variance of feature maps, as well as the correlation among feature maps captured by the Gram matrix, across different layers of a DNN. Meanwhile, the ICD component enhances diversity within condensed sets by pushing each condensed sample away from its $k$ nearest intra-class neighbors.
  • Figure 3: (a) Ablation on loss components on CIFAR10 with IPC=10 using ConvNet. (b) Evaluation in continual learning for CIFAR100 in five steps, i.e., 20 classes per step. Shaded regions show the performance tolerance.
  • Figure 4: (a, b) Intra-class diversity of two randomly selected classes of CIFAR10 with (a) IPC=50 and (b) IPC=10. Our method enhances diversity across both IPCs, addressing the limited intra-class diversity issue in DM. (c, d, e) Visualization of samples from (c) the original data, (d) data condensed by DM zhao2023dataset, and (e) data condensed by our method for CIFAR10 with IPC=10. Both methods are initialized from real samples. The proposed method improves visual quality and diversity.
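
The content term $L_C = \beta L_{ICD} + L_{MMD}$, with ICD pushing each condensed sample away from its $k$ nearest intra-class neighbors as described for Figure 2, can be sketched in NumPy as follows. This is a hedged illustration rather than the paper's formulation. It assumes that embeddings are turned into distributions with a softmax, that neighbors are found by Euclidean distance, that $L_{ICD}$ is the negative mean KL divergence (so that minimizing it maximizes diversity), and that $L_{MMD}$ is the squared distance between mean embeddings. All function names are hypothetical.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = x - x.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def icd_loss(syn_emb, k=2, eps=1e-12):
    """Assumed L_ICD: negative mean KL divergence between each synthetic
    sample and its k nearest neighbours within the same class.

    syn_emb has shape (n, d) and holds the embeddings of one class.
    """
    n = syn_emb.shape[0]
    k = min(k, n - 1)
    if k < 1:
        return 0.0  # a single sample has no neighbours to diverge from
    p = softmax(syn_emb)
    dist = np.linalg.norm(syn_emb[:, None, :] - syn_emb[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)               # exclude self-matches
    neighbours = np.argsort(dist, axis=1)[:, :k]  # (n, k) neighbour indices
    log_p = np.log(p + eps)
    # KL(p_i || p_j) for every sample i and each of its neighbours j
    kl = np.sum(p[:, None, :] * (log_p[:, None, :] - log_p[neighbours]), axis=2)
    return float(-kl.mean())

def mmd_loss(real_emb, syn_emb):
    """Assumed L_MMD: squared distance between mean embeddings."""
    diff = real_emb.mean(axis=0) - syn_emb.mean(axis=0)
    return float(np.sum(diff ** 2))

def content_loss(real_emb, syn_emb, beta=1.0, k=2):
    """L_C = beta * L_ICD + L_MMD for one class."""
    return beta * icd_loss(syn_emb, k) + mmd_loss(real_emb, syn_emb)
```

Under these assumptions, a class whose condensed samples have collapsed onto one point receives an ICD value of zero, while spread-out samples receive a negative value, so minimizing the joint objective rewards intra-class diversity.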