
Neural Spectral Decomposition for Dataset Distillation

Shaolei Yang, Shen Cheng, Mingbo Hong, Haoqiang Fan, Xing Wei, Shuaicheng Liu

TL;DR

This paper proposes Neural Spectrum Decomposition, a generic decomposition framework for dataset distillation that considers the entire dataset as a high-dimensional observation that is low-rank across all dimensions.

Abstract

In this paper, we propose Neural Spectrum Decomposition, a generic decomposition framework for dataset distillation. Unlike previous methods, we consider the entire dataset as a high-dimensional observation that is low-rank across all dimensions. We aim to discover the low-rank representation of the entire dataset and perform distillation efficiently. Toward this end, we learn a set of spectrum tensors and transformation matrices, which, through simple matrix multiplication, reconstruct the data distribution. Specifically, a spectrum tensor can be mapped back to the image space by a transformation matrix, and efficient information sharing during the distillation learning process is achieved through pairwise combinations of different spectrum vectors and transformation matrices. Furthermore, we integrate a trajectory matching optimization method guided by a real distribution. Our experimental results demonstrate that our approach achieves state-of-the-art performance on benchmarks, including CIFAR10, CIFAR100, Tiny ImageNet, and ImageNet Subset. Our code is available at https://github.com/slyang2021/NSD.
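
The pairwise combination described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the tensor shapes, the rank `r`, and the helper name `combine_pairwise` are assumptions chosen for illustration. It only shows how a few low-rank factors, multiplied in every pairing, yield many image-sized tensors from far fewer stored parameters.

```python
import numpy as np

def combine_pairwise(spectra, transforms):
    """Map every spectrum tensor through every transformation matrix.

    spectra:    (n_s, C, H, r)  hypothetical learnable spectrum tensors
    transforms: (n_t, r, W)     hypothetical learnable transformation matrices
    returns:    (n_s * n_t, C, H, W) synthetic images, one per (spectrum, transform) pair
    """
    # For each pair (s, t): image[c, h, w] = sum_r spectra[s, c, h, r] * transforms[t, r, w]
    images = np.einsum("schr,trw->stchw", spectra, transforms)
    n_s, n_t = images.shape[:2]
    return images.reshape(n_s * n_t, *images.shape[2:])

rng = np.random.default_rng(0)
spectra = rng.normal(size=(4, 3, 32, 8))    # 4 spectrum tensors, assumed rank r = 8
transforms = rng.normal(size=(5, 8, 32))    # 5 transformation matrices
images = combine_pairwise(spectra, transforms)

print(images.shape)                         # (20, 3, 32, 32)
print(spectra.size + transforms.size)       # parameters actually stored
print(images.size)                          # pixel values they generate
```

Because each factor is reused across several pairs, a gradient step on one spectrum tensor or one transformation matrix affects every image built from it, which is one way to read the "information sharing" mentioned above.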

Paper Structure (15 sections, 4 equations, 3 figures, 6 tables, 1 algorithm)

This paper contains 15 sections, 4 equations, 3 figures, 6 tables, 1 algorithm.

Figures (3)

  • Figure 1: The overall pipeline. The core lies in the decomposition approach of the image generation component. In terms of training strategy, in addition to using trajectory matching, we also incorporate guidance from the distribution of real samples.
  • Figure 2: Accuracy for different numbers (t3, t1) on the CIFAR10 dataset with IPC=1/10.
  • Figure 3: We compute the similarity along the B, H, and W dimensions, respectively, for the original images with dimensions BCHW.
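
The kind of measurement described in the Figure 3 caption can be sketched as follows. The caption does not say which similarity measure is used, so mean pairwise cosine similarity between slices is an assumption here, as are the function name `mean_slice_similarity` and the random placeholder batch.

```python
import numpy as np

def mean_slice_similarity(batch, axis):
    """Mean pairwise cosine similarity between the slices taken along `axis`."""
    # Bring the chosen axis to the front and flatten every slice into a vector.
    slices = np.moveaxis(batch, axis, 0).reshape(batch.shape[axis], -1)
    norms = np.linalg.norm(slices, axis=1, keepdims=True)
    unit = slices / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    # Exclude the diagonal, where every slice is compared with itself.
    off_diagonal = sim[~np.eye(sim.shape[0], dtype=bool)]
    return float(off_diagonal.mean())

rng = np.random.default_rng(0)
batch = rng.normal(size=(16, 3, 32, 32))    # placeholder for real images in BCHW layout

for name, axis in (("B", 0), ("H", 2), ("W", 3)):
    print(name, mean_slice_similarity(batch, axis))
```

On real images, high similarity along an axis suggests redundancy along that axis, which is consistent with the low-rank assumption stated in the abstract; the random placeholder above will show values near zero.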