Using Low-Discrepancy Points for Data Compression in Machine Learning: An Experimental Comparison

Simone Göttlich; Jacob Heieck; Andreas Neuenkirch

TL;DR

This work investigates data reduction for regression and neural-network training using low-discrepancy points (Quasi-Monte Carlo). It compares two QMC-based compression schemes (QMC-averaging and QMC-Voronoi) to the adaptive supercompress method, highlighting deterministic error bounds for the QMC approaches and empirical performance across synthetic test functions and MNIST. The results show that adaptive clustering via the standard supercompress approach consistently outperforms the QMC methods on real-world, high-dimensional data, while QMC-Voronoi offers competitive performance on simple, regular problems but fails to scale to MNIST. The findings suggest that for complex data, output-space–focused clustering with adaptive refinement provides the most reliable compression for maintaining predictive accuracy while reducing training cost, whereas QMC-based guarantees are most beneficial in regular settings.

Abstract

Low-discrepancy points (also called Quasi-Monte Carlo points) are deterministically and cleverly chosen point sets in the unit cube, which provide an approximation of the uniform distribution. We explore two methods based on such low-discrepancy points to reduce large data sets in order to train neural networks. The first one is the method of Dick and Feischl [4], which relies on digital nets and an averaging procedure. Motivated by our experimental findings, we construct a second method, which again uses digital nets, but Voronoi clustering instead of averaging. Both methods are compared to the supercompress approach of [14], which is a variant of the K-means clustering algorithm. The comparison is done in terms of the compression error for different objective functions and the accuracy of the training of a neural network.
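The idea of compressing a data set onto a digital net can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the inputs already lie in the unit cube, uses the first 2^m points of SciPy's unscrambled Sobol sequence as the net, and replaces the data by one weighted, averaged response per Voronoi cell of the net. The paper's QMC-Voronoi construction may differ in its details. The function names `sobol_net`, `is_0m2_net`, and `voronoi_compress` and the test function are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import qmc


def sobol_net(m, dim=2):
    """First 2**m points of the unscrambled Sobol sequence in [0, 1)^dim."""
    return qmc.Sobol(d=dim, scramble=False).random_base2(m=m)


def is_0m2_net(points, m):
    """Check the (0, m, 2)-net property in base 2: every elementary
    interval of area 2**-m must contain exactly one point."""
    for d1 in range(m + 1):
        d2 = m - d1
        ix = np.floor(points[:, 0] * 2**d1).astype(int)
        iy = np.floor(points[:, 1] * 2**d2).astype(int)
        counts = np.bincount(ix * 2**d2 + iy, minlength=2**m)
        if not np.all(counts == 1):
            return False
    return True


def voronoi_compress(X, y, net):
    """Assign each sample to its nearest net point (its Voronoi cell) and
    return the occupied net points, mean responses, and cell counts.
    Illustrative assumption, not the paper's exact procedure."""
    _, cell = cKDTree(net).query(X)
    counts = np.bincount(cell, minlength=len(net))
    sums = np.bincount(cell, weights=y, minlength=len(net))
    keep = counts > 0
    return net[keep], sums[keep] / counts[keep], counts[keep]


# Synthetic data in the unit square (assumed test function, not from the paper).
rng = np.random.default_rng(0)
X = rng.random((5000, 2))
y = np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2

net = sobol_net(m=4)                    # 16 points, as in a (0, 4, 2)-net
Z, y_bar, w = voronoi_compress(X, y, net)
```

A regression model or neural network could then be trained on the pairs `(Z, y_bar)` with sample weights `w` instead of the full data set. Because each compressed response is a cell average, the weighted mean of `y_bar` reproduces the mean of `y` exactly.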

Paper Structure (15 sections, 3 theorems, 45 equations, 6 figures, 10 tables)

Introduction
Low-discrepancy point sets and regression
Data compression methods
Quasi-Monte Carlo compression
Construction of digital nets
Error bounds
Implementation
Supercompress method
QMC-Voronoi method
Numerical results
Test functions
Neural networks
Conclusion
Declarations
Appendix

Key Result

Lemma 3.2

Let $\nu \geq 0$ be an integer. For all $\boldsymbol{a} \in \mathbb{N}_{0}^{s}$ the combination principle holds. Here, $\mathbbm{1}_{A}$ denotes the indicator function for an arbitrary set $A$.

Figures (6)

Figure 1: $(0, 4, 2)$-net in base $2$ (blue points)
Figure 2: $(0,4,2)$-net in base $2$ (blue points) with point set $\mathcal{X}$ (red crosses)
Figure 3: visualization of the precompressed data
Figure 4: confusion chart of the neural network without compression
Figure 5: confusion charts for different methods with a compression rate of $20 \%$
...and 1 more figure

Theorems & Definitions (5)

Definition 3.1: e.g., p.5, 1
Lemma 3.2: Lemma 1, 1
Definition 3.3: p.12, p.14, 1
Theorem 3.4: Corollary 12, 1
Theorem 3.5: Corollary 14, 1
