Efficient Verification-Based Face Identification
Amit Rozner, Barak Battash, Ofir Lindenbaum, Lior Wolf
TL;DR
The paper tackles efficient face verification on edge devices by replacing a large embedding-based system with a per-identity binary verifier generated by a hypernetwork $h$, which outputs weights $\theta^i$ for a compact on-device model $f(\theta^i)$. Enrollment produces $\theta^i$ from a single image using a frozen backbone $h_{bb}$ and a trainable generator $h_{gen}$, while inference uses only the small edge model, discarding $h$ to minimize computation. A novel training regime combines weighted binary cross-entropy with a norm penalty and introduces K-means Centered Sampling (KCS) to create hard-negative batches, yielding a model with $23{,}000$ parameters and $5\times 10^{6}$ FLOPS that remains competitive across six datasets. The approach demonstrates that re-framing face verification as a personalized, edge-friendly task can dramatically reduce memory and compute requirements while preserving performance, and offers a path to extending the idea to other domains.
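The split between enrollment and inference can be illustrated with a minimal sketch. It assumes the frozen backbone $h_{bb}$ has already turned the enrollment image into an embedding, stands in a fixed linear map for $h_{gen}$, and uses a one-hidden-layer network for $f(\theta^i)$. All dimensions, function names, and the layer layout are hypothetical and chosen only for illustration; they are not the architecture used in the paper.

```python
import math
import random

EMB_DIM = 8   # hypothetical size of the backbone embedding
IN_DIM = 6    # hypothetical size of the edge model's input features
HIDDEN = 4    # hypothetical hidden width of the edge model f

# Weights needed by f: hidden weights, hidden biases, output weights, output bias.
N_THETA = IN_DIM * HIDDEN + HIDDEN + HIDDEN + 1


def make_generator(seed=0):
    """Stand-in for h_gen (assumption): a fixed linear map from embedding to theta."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(EMB_DIM)] for _ in range(N_THETA)]


def enroll(embedding, generator):
    """Enrollment: produce per-identity weights theta^i from one embedding."""
    return [sum(w * e for w, e in zip(row, embedding)) for row in generator]


def verify(theta, x):
    """Inference: run only the compact model f(theta) on features x."""
    i = 0
    hidden = []
    for _ in range(HIDDEN):
        row = theta[i:i + IN_DIM]
        i += IN_DIM
        hidden.append(sum(w * v for w, v in zip(row, x)))
    biases = theta[i:i + HIDDEN]
    i += HIDDEN
    hidden = [max(0.0, h + b) for h, b in zip(hidden, biases)]  # ReLU
    out_w = theta[i:i + HIDDEN]
    i += HIDDEN
    logit = sum(w * h for w, h in zip(out_w, hidden)) + theta[i]
    return 1.0 / (1.0 + math.exp(-logit))  # probability of "same identity"
```

After `enroll` returns, the generator is no longer needed: `verify` touches only the flat weight vector `theta`, which mirrors the idea that the hypernetwork is discarded at inference time.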
Abstract
We study the problem of performing face verification with an efficient neural model $f$. The efficiency of $f$ stems from simplifying the face verification problem from an embedding nearest-neighbor search into a binary classification problem: each user has their own neural network $f$. To allow information sharing between different individuals in the training set, we do not train $f$ directly but instead generate the model weights using a hypernetwork $h$. This yields a compact personalized model for face identification that can be deployed on edge devices. Key to the method's success are a novel way of generating hard negatives and a careful scheduling of the training objectives. The resulting $f$ is very small, requiring only 23k parameters and 5M floating point operations (FLOPS). We use six face verification datasets to demonstrate that our method is on par with or better than state-of-the-art models, with a significantly reduced number of parameters and computational burden. Furthermore, we perform an extensive ablation study to demonstrate the importance of each element in our method.
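One plausible way to build hard-negative batches from clustered embeddings is sketched below. The text here does not specify how the sampling works, so this is an assumption: identity embeddings are clustered with a plain k-means, and a batch is formed from the identities closest to one centroid, so that the negatives within a batch resemble one another. The function names and the toy data are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns the list of centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                dim = len(members[0])
                centroids[j] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return centroids


def hard_negative_batch(points, centroid, batch_size):
    """Assumed sampling rule: indices of the points nearest to one centroid."""
    order = sorted(range(len(points)), key=lambda i: sq_dist(points[i], centroid))
    return order[:batch_size]


# Toy identity embeddings forming two well-separated groups.
embeddings = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids = kmeans(embeddings, k=2)
batch = hard_negative_batch(embeddings, centroids[0], batch_size=3)
```

Because every identity in `batch` lies near the same centroid, a verifier trained on it must separate faces that are already similar in embedding space, which is what makes the negatives hard.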
