One-Shot Knowledge Transfer for Scalable Person Re-Identification

Longhua Li, Lei Qi, Xin Geng

TL;DR

This paper addresses the need for resource-adaptive person re-identification (ReID) models suitable for edge devices by proposing One-Shot Knowledge Transfer (OSKT). OSKT compresses teacher knowledge into a width-reduced weight chain that preserves depth and can be expanded to any target width without extra training, so multiple models can be generated quickly. The paper formalizes a unified CNN/ViT representation, introduces row clustering and progressive refinement to optimize the weight chain, and demonstrates strong cross- and intra-scenario performance, including compatibility with lightweight ReID architectures. The approach significantly reduces the computational overhead of model provisioning, achieves state-of-the-art results on standard benchmarks, and offers practical benefits for deploying scalable, privacy-conscious ReID systems on edge devices.

Abstract

Edge computing in person re-identification (ReID) is crucial for reducing the load on central cloud servers and ensuring user privacy. Conventional compression methods for obtaining compact models require computations for each individual student model. When multiple models of varying sizes are needed to accommodate different resource conditions, this leads to repetitive and cumbersome computations. To address this challenge, we propose a novel knowledge inheritance approach named OSKT (One-Shot Knowledge Transfer), which consolidates the knowledge of the teacher model into an intermediate carrier called a weight chain. When a downstream scenario demands a model that meets specific resource constraints, this weight chain can be expanded to the target model size without additional computation. OSKT significantly outperforms state-of-the-art compression methods, with the added advantage of one-time knowledge transfer that eliminates the need for frequent computations for each target model.

Paper Structure

This paper contains 23 sections, 4 equations, 8 figures, 7 tables.

Figures (8)

  • Figure 1: (a) Transfer knowledge from a pre-trained ResNet50 to student models of different sizes. (b) Refine knowledge into a weight chain in a single pass, enabling adaptive expansion into models of varying sizes that meet downstream resource constraints.
  • Figure 2: (a) In CNNs, a filter is abstracted as a row, while a channel in the weight matrix is abstracted as a column. (b) In ViTs, the weight matrix is also abstracted into rows and columns.
  • Figure 3: (a) Achieving identity transformation by merging identical rows and summing corresponding columns in the next layer. (b) Extracting lightweight principal features by averaging the normalization layer’s affine transformation parameters.
  • Figure 4: An overview of the proposed framework. (a) Cluster the weight rows within each layer of the teacher model, using the cluster centers as the initial weight chain rows. (b) Train the teacher model and the S-Student constrained by the weight chain, using a refining loss and a ReID loss to infuse core knowledge into the weight chain. (c) Reuse and stack the refined rows proportionally to reach the layer width required by the student model. Average the $(\gamma,\beta)$ pairs within each normalization layer and sum the column weights of each subsequent layer accordingly to generate the student model. This is an $O(1)$ operation that requires no additional computation.
  • Figure 5: Convergence curves of student models generated by our OSKT, compared to other approaches across diverse scenarios.
  • ...and 3 more figures
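
The identity transformation described in the caption of Figure 3(a) can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes two bias-free fully connected layers with a ReLU between them, and the helper name `merge_identical_rows` is hypothetical. Rows of the first weight matrix that are exactly identical are merged, and the matching columns of the next layer are summed, so the narrower network should produce the same output as the wider one.

```python
import numpy as np


def merge_identical_rows(w1, w2):
    """Merge duplicate rows of w1 and sum the matching columns of w2.

    w1: (rows, in_features) weight of the current layer.
    w2: (out_features, rows) weight of the next layer.
    Hypothetical helper for illustration; biases and normalization are omitted.
    """
    unique_rows, inverse = np.unique(w1, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten; shape differs across NumPy versions
    w2_merged = np.zeros((w2.shape[0], unique_rows.shape[0]))
    for old_col, new_col in enumerate(inverse):
        # Every duplicate row feeds the same activation forward,
        # so its outgoing weights can simply be added together.
        w2_merged[:, new_col] += w2[:, old_col]
    return unique_rows, w2_merged


def relu(z):
    return np.maximum(z, 0.0)


rng = np.random.default_rng(0)
base = rng.normal(size=(3, 5))       # 3 distinct rows
w1 = base[[0, 1, 1, 2, 0, 2]]        # wide layer: 6 rows with duplicates
w2 = rng.normal(size=(4, 6))         # next layer consumes the 6 activations
x = rng.normal(size=5)

y_wide = w2 @ relu(w1 @ x)
w1_small, w2_small = merge_identical_rows(w1, w2)
y_narrow = w2_small @ relu(w1_small @ x)

print(w1.shape, "->", w1_small.shape)
print("outputs match:", np.allclose(y_wide, y_narrow))
```

The sketch only covers exact duplicates. The refinement and expansion steps in Figure 4 operate on clustered, trained rows and also involve normalization-layer parameters, which are outside the scope of this example.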