Pseudo-label Refinement for Improving Self-Supervised Learning Systems

Zia-ur-Rehman, Arif Mahmood, Wenxiong Kang

TL;DR

Experimental results demonstrate that the modified Re-ID baseline, incorporating the SLR algorithm, achieves significantly improved mean Average Precision performance in various UDA tasks, including real-to-synthetic, synthetic-to-real, and different real-to-real scenarios.

Abstract

Self-supervised learning systems have gained significant attention in recent years by leveraging clustering-based pseudo-labels to provide supervision without the need for human annotations. However, the noise that clustering methods introduce into these pseudo-labels hampers the learning process and degrades performance. In this work, we propose a pseudo-label refinement (SLR) algorithm to address this issue. The cluster labels from the previous epoch are projected to the cluster-label space of the current epoch, and a linear combination of the new label and the projected label is computed as a soft refined label that contains information from both the previous-epoch and current-epoch clusters. In contrast to the common practice of using the maximum value as a cluster/class indicator, we employ hierarchical clustering on these soft pseudo-labels to generate refined hard labels. This approach better utilizes the information embedded in the soft labels and outperforms the simple maximum-value approach to hard-label generation. The effectiveness of the proposed SLR algorithm is evaluated in the context of person re-identification (Re-ID) using unsupervised domain adaptation (UDA). Experimental results demonstrate that the modified Re-ID baseline, incorporating the SLR algorithm, achieves significantly improved mean Average Precision (mAP) performance in various UDA tasks, including real-to-synthetic, synthetic-to-real, and different real-to-real scenarios. These findings highlight the efficacy of the SLR algorithm in enhancing the performance of self-supervised learning systems.
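The refinement steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the overlap-based projection matrix, the mixing weight `alpha`, the distance `threshold`, the single-linkage clustering used as a stand-in for the paper's hierarchical clustering, and all function names are assumptions made for illustration only.

```python
import math

def projection_matrix(prev, curr, k_prev, k_curr):
    # ASSUMPTION: entry [i][j] is the fraction of previous cluster i's
    # members that landed in current cluster j. The paper may define
    # its projection matrix differently.
    counts = [[0.0] * k_curr for _ in range(k_prev)]
    for p, c in zip(prev, curr):
        counts[p][c] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([v / total if total else 0.0 for v in row])
    return matrix

def refine_soft_labels(prev, curr, k_prev, k_curr, alpha=0.5):
    # Soft label = alpha * one-hot(current) + (1 - alpha) * projected previous.
    # A one-hot previous label times the matrix simply selects one row.
    proj = projection_matrix(prev, curr, k_prev, k_curr)
    soft = []
    for p, c in zip(prev, curr):
        soft.append([
            alpha * (1.0 if j == c else 0.0) + (1.0 - alpha) * proj[p][j]
            for j in range(k_curr)
        ])
    return soft

def single_linkage(points, threshold):
    # Stand-in hierarchical clustering: repeatedly merge the two closest
    # clusters until the smallest inter-cluster gap exceeds the threshold.
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        d, i, j = min(
            (gap(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[i].extend(clusters.pop(j))
    labels = [0] * len(points)
    for cluster_id, members in enumerate(clusters):
        for m in members:
            labels[m] = cluster_id
    return labels

# Toy data: three clusters in the previous epoch, four in the current one.
prev_labels = [0, 0, 0, 1, 1, 2, 2, 2]
curr_labels = [0, 0, 1, 1, 1, 2, 2, 3]
soft_labels = refine_soft_labels(prev_labels, curr_labels, k_prev=3, k_curr=4)
hard_labels = single_linkage(soft_labels, threshold=0.5)
```

In this sketch, `alpha` controls how much the current clustering is trusted over the projected history, and `threshold` controls how aggressively soft labels are merged into hard labels; both values are arbitrary here.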

Paper Structure

This paper contains 24 sections, 9 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: An example of the proposed Pseudo Label Refinement (SLR) algorithm: in the previous epoch, three clusters were generated while in the current epoch, four clusters are obtained. Cluster labels are represented using the one-hot encoding scheme. The previous label of an instance is projected to the space of the current labels using a projection matrix. A linear combination of the projected label and the current label is used to get refined soft labels, which are then hierarchically clustered to get refined pseudo labels.
  • Figure 2: System diagram of modified baseline with proposed Pseudo Label Refinement (SLR) algorithm: (a) Teacher network, (b) Student network, (c) Diverse feature layer, (d) The SLR algorithm starts here: clustering diverse features from teacher network, (e) Projection matrix estimation between consecutive epochs, (f) Projected pseudo-labels from previous epoch to the current epoch label space, (g) Refined soft labels for the current epoch, (h) The SLR algorithm ends here: hierarchical clustering for hard labels generation, (i) Teacher dynamic classifier trained on hard labels, (j) Student dynamic classifier trained on hard labels and soft labels from teacher network using cross-branch supervision, (k) The updated student network weights are used to update the teacher network weights using the Exponential Moving Average (EMA) approach.
  • Figure 3: Performance comparison of the SLR algorithm in terms of mAP at 40 and 50 epochs with varying minimum cluster size parameter.
  • Figure 4: Performance comparison of the SLR algorithm in terms of rank-1 accuracy at 40 and 50 epochs. The best rank-1 result is 93.7%, obtained at 50 epochs with a minimum cluster size of 7.
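
Step (k) in Figure 2 refers to an Exponential Moving Average update of the teacher weights from the student weights. A minimal sketch of such an update follows, assuming weights are stored as a plain dictionary of floats; the function name `ema_update` and the `momentum` value are hypothetical and are not taken from the paper.

```python
def ema_update(teacher, student, momentum=0.999):
    # Each teacher weight moves a small step toward the matching student
    # weight: new = momentum * teacher + (1 - momentum) * student.
    # ASSUMPTION: both dictionaries share the same keys.
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }

# Toy weights; real networks would hold tensors rather than floats.
teacher_weights = {"w": 1.0, "b": 0.0}
student_weights = {"w": 0.0, "b": 1.0}
updated_teacher = ema_update(teacher_weights, student_weights, momentum=0.9)
```

A momentum close to 1 makes the teacher change slowly, which is the usual reason for pairing an EMA teacher with a faster-moving student.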