Embedding Compression for Efficient Re-Identification
Luke McDermott
TL;DR
Re-identification embeddings become a storage bottleneck as datasets scale, which motivates explicit embedding compression. The authors benchmark three dimension-reduction strategies (slicing, low-rank embeddings, and iterative structured pruning) together with quantization-aware training on ViT-backed ReID models across three datasets, achieving up to 96x compression with only about a 4% drop on Market-1501. They find that the high-dimensional latent space is underutilized, with slicing and low-rank approaches often outperforming full-size embeddings at high compression, while quantization offers additional gains in certain regimes. The work suggests rethinking default embedding dimensionality and regularization to better exploit the latent space, and calls for broader benchmarks and cross-domain evaluation to guide practical deployment on edge and offline systems.
Abstract
Real-world re-identification (ReID) algorithms aim to map new observations of an object to previously recorded instances. These systems are often constrained by the quantity and size of the stored embeddings. To combat this scaling problem, we attempt to shrink these vectors using a variety of compression techniques. In this paper, we benchmark quantization-aware training along with three dimension-reduction methods: iterative structured pruning, slicing the embeddings at initialization, and using low-rank embeddings. We find that ReID embeddings can be compressed by up to 96x with minimal drop in performance. This implies that modern re-identification paradigms do not fully leverage the high-dimensional latent space, opening up further research to increase the capabilities of these systems.
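To make the compression figure concrete, the sketch below shows how reducing the dimension and lowering the precision multiply into a single storage ratio. It is illustrative only and is not the authors' implementation. The 768-dimensional float32 baseline (the hidden size of ViT-Base) and the 32-dimension, 8-bit target are assumptions chosen because they multiply out to 96x; the abstract does not state which configuration reaches that figure. The helper names are hypothetical, and the quantizer is a simple post-hoc min-max scheme rather than quantization-aware training.

```python
def compression_ratio(full_dim, full_bits, kept_dim, kept_bits):
    """Ratio of original to compressed embedding size, measured in bits."""
    return (full_dim * full_bits) / (kept_dim * kept_bits)


def slice_embedding(vec, kept_dim):
    """Dimension reduction by slicing: keep only the first kept_dim entries."""
    return vec[:kept_dim]


def quantize_uniform(vec, bits):
    """Post-hoc uniform quantization (NOT quantization-aware training).

    Maps each float to an integer code in [0, 2**bits - 1] using
    min-max scaling, and returns the codes plus the offset and scale
    needed to dequantize.
    """
    lo, hi = min(vec), max(vec)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((x - lo) / scale) for x in vec]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Approximate reconstruction of the original floats."""
    return [lo + c * scale for c in codes]


# Hypothetical 768-dim embedding (assumed baseline, stored as float32).
embedding = [((i * 37) % 101) / 50.0 - 1.0 for i in range(768)]

sliced = slice_embedding(embedding, 32)          # 768 -> 32 dimensions
codes, lo, scale = quantize_uniform(sliced, 8)   # 32-bit -> 8-bit per entry

print(compression_ratio(768, 32, 32, 8))  # 96.0
```

Under these assumed settings the dimension shrinks by 24x and the precision by 4x, so each stored vector takes 32 bytes instead of 3072. Other combinations, such as 64 dimensions at 4 bits, give the same ratio.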
