Embedding Compression for Efficient Re-Identification

Luke McDermott

TL;DR

Re-identification (ReID) embeddings become a storage bottleneck as datasets scale, which motivates explicit embedding compression. The authors benchmark three dimension-reduction strategies (slicing, low-rank embeddings, and iterative structured pruning) together with quantization-aware training on ViT-backed ReID models across three datasets (Market-1501, PRW, and PRAI), achieving up to 96x compression with only about a 4% drop on Market-1501. They find that the high-dimensional latent space is underutilized, with slicing and low-rank approaches often outperforming full-size embeddings at high compression, while quantization offers additional gains in certain regimes. The work suggests rethinking default embedding dimensionality and regularization to better exploit the latent space, and calls for broader benchmarks and cross-domain evaluation to guide practical deployment on edge and offline systems.

Abstract

Real-world re-identification (ReID) algorithms aim to map new observations of an object to previously recorded instances. These systems are often constrained by the quantity and size of the stored embeddings. To combat this scaling problem, we attempt to shrink these vectors using a variety of compression techniques. In this paper, we benchmark quantization-aware training along with three dimension reduction methods: iterative structured pruning, slicing the embeddings at initialization, and using low-rank embeddings. We find that ReID embeddings can be compressed by up to 96x with minimal drop in performance. This implies that modern re-identification paradigms do not fully leverage the high-dimensional latent space, opening up further research to increase the capabilities of these systems.
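To make the idea of shrinking a stored embedding concrete, here is a minimal NumPy sketch of two of the ingredients named in the abstract: slicing (keeping only the first k dimensions) and 8-bit quantization. It is an illustration under stated assumptions, not the paper's implementation. It applies both steps after the fact to random vectors, whereas the paper applies slicing at initialization and uses quantization-aware training. The helper names `slice_embedding`, `quantize_int8`, and `dequantize`, the float32 baseline, and the symmetric per-vector quantization scheme are all hypothetical choices made for this example.

```python
import numpy as np

def slice_embedding(emb: np.ndarray, k: int) -> np.ndarray:
    """Keep only the first k dimensions of each embedding (one embedding per row)."""
    return emb[..., :k]

def quantize_int8(emb: np.ndarray):
    """Symmetric per-vector uniform quantization to int8 (an assumed scheme).

    Returns the int8 codes and one float32 scale per vector.
    """
    scale = np.abs(emb).max(axis=-1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # guard against all-zero vectors
    codes = np.clip(np.round(emb / scale), -127, 127).astype(np.int8)
    return codes, scale.astype(np.float32)

def dequantize(codes: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Map int8 codes back to approximate float32 values."""
    return codes.astype(np.float32) * scale

# Hypothetical gallery: four random 768-dimensional float32 embeddings.
rng = np.random.default_rng(0)
gallery = rng.standard_normal((4, 768)).astype(np.float32)

small = slice_embedding(gallery, 32)   # 768 -> 32 dimensions
codes, scale = quantize_int8(small)    # 32 bits -> 8 bits per dimension
approx = dequantize(codes, scale)

# Storage for the codes alone; the per-vector scales add a small overhead.
print("original bytes:", gallery.nbytes)
print("compressed bytes:", codes.nbytes)
```

In this toy setting the codes take 96 times less space than the original float32 vectors, before counting the per-vector scale. Whether retrieval quality survives such a reduction depends on training the model for the smaller representation, which is what the benchmarked methods address.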

Paper Structure (12 sections, 5 figures)

Introduction
Background
Methods
Slicing
Low Rank Embeddings
Iterative Structured Pruning
Quantization
Results
Market-1501
PRW
PRAI
Discussion

Figures (5)

Figure 1: Re-identification Training Paradigm
Figure 2: Dimension Reduction Methods
Figure 3: Performance of each compression method across different compression ratios on Market-1501. The original embedding dimension is 768. Higher mAP is better performance. Smaller embedding dimension is more efficient.
Figure 4: Performance of each compression method across different compression ratios on PRW.
Figure 5: Performance of each compression method across different compression ratios on PRAI.
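
The compression ratios on the figure axes can be related to embedding dimension and numeric precision with simple arithmetic. The sketch below assumes a float32 (32-bit) baseline, which the source does not state, and uses the 768-dimensional original embedding from the Figure 3 caption. The function name `compression_ratio` and the listed settings are hypothetical; the source does not say which combination of dimension and bit width produces the reported 96x.

```python
def compression_ratio(orig_dim: int, orig_bits: int, new_dim: int, new_bits: int) -> float:
    """Ratio of bits stored per embedding before and after compression."""
    return (orig_dim * orig_bits) / (new_dim * new_bits)

ORIG_DIM = 768   # original embedding dimension, from the Figure 3 caption
ORIG_BITS = 32   # assumption: embeddings are stored as float32

# Hypothetical settings, chosen only to illustrate the arithmetic.
for new_dim, new_bits in [(384, 32), (32, 32), (32, 8), (8, 32)]:
    ratio = compression_ratio(ORIG_DIM, ORIG_BITS, new_dim, new_bits)
    print(f"dim={new_dim:4d}, bits={new_bits:2d} -> {ratio:g}x")
```

Under these assumptions, any setting whose dimension times bit width equals 256 bits per embedding gives a 96x ratio, so dimension reduction and quantization can be traded against each other to reach the same storage budget.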
