VehicleGAN: Pair-flexible Pose Guided Image Synthesis for Vehicle Re-identification

Baolu Li, Ping Liu, Lan Fu, Jinlong Li, Jianwu Fang, Zhigang Xu, Hongkai Yu

TL;DR

The paper tackles pose variation in vehicle Re-ID by introducing VehicleGAN, a pose-guided image synthesis framework that operates without 3D models and supports both supervised and unsupervised training through AutoReconstruction. A Joint Metric Learning (JML) strategy then fuses real and synthetic features, training a real-image branch ($\mathrm{M_R}$) and a synthetic-image branch ($\mathrm{M_S}$) to produce pose-invariant representations. Key contributions include the Pair-flexible VehicleGAN with AutoReconstruction, a trust-region-based unsupervised training regime, and a Unified Target Pose for consistent synthesis, with gains reported on VeRi-776 and VehicleID. The approach improves discriminability in cross-view Re-ID and offers a practical pipeline for leveraging synthetic data in real-world surveillance scenarios.

Abstract

Vehicle Re-identification (Re-ID) has been broadly studied over the last decade; however, differing camera view angles, which confuse discrimination in the feature subspace for vehicles in various poses, remain challenging for Vehicle Re-ID models in the real world. To improve Vehicle Re-ID models, this paper proposes to synthesize a large number of vehicle images in a target pose; the idea is to project vehicles of diverse poses into a unified target pose so as to enhance feature discrimination. Considering that paired data of the same vehicles across different traffic surveillance cameras might not be available in the real world, we propose the first Pair-flexible Pose Guided Image Synthesis method for Vehicle Re-ID, named VehicleGAN, which works in both supervised and unsupervised settings without knowledge of geometric 3D models. Because of the feature distribution difference between real and synthetic data, simply training a traditional metric-learning-based Re-ID model with data-level fusion (i.e., data augmentation) is not satisfactory; we therefore propose a new Joint Metric Learning (JML) via effective feature-level fusion of real and synthetic data. Extensive experimental results on the public VeRi-776 and VehicleID datasets demonstrate the accuracy and effectiveness of the proposed VehicleGAN and JML.

Paper Structure (27 sections, 15 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Enhanced discrimination obtained by synthesizing vehicle images in a unified target pose with VehicleGAN. (a) Vehicle images from the VeRi-776 dataset [liu2016deep1]. (b) Corresponding vehicle images synthesized by VehicleGAN.
  • Figure 2: The pipeline of the proposed Joint Metric Learning using the proposed VehicleGAN. $\mathrm{M_R}$: Re-ID model for real images. $\mathrm{M_S}$: Re-ID model for synthetic images. The Unified Pose part represents the synthetic images generated in the same target pose by the pre-trained VehicleGAN. The real and synthetic features are fused for joint training and testing.
  • Figure 3: The detailed pipeline of the proposed VehicleGAN with the idea of AutoReconstruction. The input of the generator is the channel-wise concatenation of the original image and the target pose, and the output of the generator is the synthesized image of the original image in the target pose. The unsupervised setting (unpaired data) is shown in this example.
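
The two data-flow steps named in the captions of Figures 2 and 3 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `build_generator_input` and `fuse_features`, the single-channel pose map, the 64×64 image size, and the choice of fusing by concatenating L2-normalized vectors are all assumptions made for illustration; the captions state only that the image and target pose are concatenated channel-wise and that real and synthetic features are fused.

```python
import numpy as np


def build_generator_input(image: np.ndarray, target_pose: np.ndarray) -> np.ndarray:
    """Channel-wise concatenation of an image and a target-pose map.

    Both arrays use a (channels, height, width) layout. The pose encoding
    (number of channels, heatmap vs. mask) is an assumption here.
    """
    if image.shape[1:] != target_pose.shape[1:]:
        raise ValueError("image and pose map must share height and width")
    return np.concatenate([image, target_pose], axis=0)


def fuse_features(real_feat: np.ndarray, synth_feat: np.ndarray) -> np.ndarray:
    """Hypothetical feature-level fusion: L2-normalize each vector, then concatenate.

    The actual fusion used in JML is not specified in this summary.
    """
    def l2_normalize(v: np.ndarray) -> np.ndarray:
        return v / (np.linalg.norm(v) + 1e-12)

    return np.concatenate([l2_normalize(real_feat), l2_normalize(synth_feat)])


# Toy inputs: an RGB image and an assumed single-channel pose map.
image = np.zeros((3, 64, 64), dtype=np.float32)
pose = np.ones((1, 64, 64), dtype=np.float32)
gen_in = build_generator_input(image, pose)
print(gen_in.shape)  # (4, 64, 64)

# Toy feature vectors standing in for the outputs of M_R and M_S.
fused = fuse_features(np.array([3.0, 4.0]), np.array([0.0, 2.0]))
print(fused.shape)  # (4,)
```

Concatenating along the channel axis keeps the image and the pose spatially aligned, so a convolutional generator sees both at every pixel location. Fusing at the feature level, rather than mixing real and synthetic images in one training set, lets each branch keep its own feature distribution before the two are combined.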