
GLRT-Based Metric Learning for Remote Sensing Object Retrieval

Linping Zhang, Yu Liu, Xueqian Wang, Gang Li, You He

TL;DR

This work proposes two methods. The first is a generalized likelihood ratio test-based metric learning (GLRTML) approach, which estimates the relative difficulty of sample pairs by incorporating global data distribution information during the training and test phases. The second is the clustering pseudo-labels-based fast parameter adaptation (CPLFPA) method, which efficiently estimates the distribution of embeddings in the target domain by clustering target domain instances and re-estimating the distribution parameters for GLRTML.

Abstract

With the improvement in the quantity and quality of remote sensing images, content-based remote sensing object retrieval (CBRSOR) has become an increasingly important topic. However, existing CBRSOR methods neglect global statistical information during both the training and test stages, which leads to the overfitting of neural networks to simple sample pairs during training and to suboptimal metric performance. Inspired by the Neyman-Pearson theorem, we propose a generalized likelihood ratio test-based metric learning (GLRTML) approach, which can estimate the relative difficulty of sample pairs by incorporating global data distribution information during the training and test phases. This guides the network to focus more on difficult samples during training, thereby encouraging it to learn more discriminative feature embeddings. In addition, GLRT yields a more effective metric space than traditional ones owing to its use of global data distribution information. Accurately estimating the distribution of embeddings is critical for GLRTML. However, in real-world applications, there is often a distribution shift between the training and target domains, which diminishes the effectiveness of directly using the distribution estimated on training data. To address this issue, we propose the clustering pseudo-labels-based fast parameter adaptation (CPLFPA) method. CPLFPA efficiently estimates the distribution of embeddings in the target domain by clustering target domain instances and re-estimating the distribution parameters for GLRTML. We reorganize datasets for CBRSOR tasks based on the fine-grained ship remote sensing image slices (FGSRSI-23) and military aircraft recognition (MAR20) datasets. Extensive experiments on these datasets demonstrate the effectiveness of our proposed GLRTML and CPLFPA.
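
The two ideas in the abstract can be illustrated with a small toy sketch. It is not the authors' formulation, which this page does not give. It assumes, purely for illustration, that the difference between two embeddings follows a zero-mean Gaussian whose covariance depends on whether the pair is positive or negative, scores a pair by a log-likelihood ratio with plug-in covariance estimates, and then re-estimates those covariances on an unlabeled, shifted target set using k-means pseudo-labels. All function names (`pair_differences`, `fit_cov`, `llr_distance`, `kmeans_labels`) and the toy data are hypothetical.

```python
import numpy as np

def pair_differences(emb, labels):
    """Split pairwise embedding differences into same-label and different-label sets."""
    i, j = np.triu_indices(len(emb), k=1)
    diff = emb[i] - emb[j]
    same = labels[i] == labels[j]
    return diff[same], diff[~same]

def fit_cov(diff, eps=1e-3):
    """Zero-mean Gaussian fit (assumption): ridge-regularized second-moment matrix."""
    return diff.T @ diff / len(diff) + eps * np.eye(diff.shape[1])

def log_gauss(diff, cov):
    """Log density of N(0, cov) evaluated at each row of diff."""
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (diff.shape[1] * np.log(2 * np.pi) + logdet + maha)

def llr_distance(a, b, cov_pos, cov_neg):
    """Log-likelihood ratio of 'negative pair' versus 'positive pair' for (a, b)."""
    diff = np.atleast_2d(a - b)
    return float((log_gauss(diff, cov_neg) - log_gauss(diff, cov_pos))[0])

def kmeans_labels(x, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [x[0]]
    for _ in range(k - 1):
        d2 = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        centers = np.array([
            x[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
    return labels

# Toy labeled source domain: three well-separated classes in 2-D.
rng = np.random.default_rng(0)
class_centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
labels = np.repeat(np.arange(3), 20)
source = class_centers[labels] + 0.1 * rng.standard_normal((60, 2))

# Estimate the pair-difference distributions from all training pairs.
pos, neg = pair_differences(source, labels)
cov_pos, cov_neg = fit_cov(pos), fit_cov(neg)
d_same = llr_distance(source[0], source[1], cov_pos, cov_neg)   # same class
d_diff = llr_distance(source[0], source[20], cov_pos, cov_neg)  # different classes
print("same-class score:", d_same, "different-class score:", d_diff)

# Unlabeled target domain with a scale and offset shift (toy assumption).
target = 2.0 * source + np.array([3.0, -1.0])
pseudo = kmeans_labels(target, k=3)            # clustering pseudo-labels
t_pos, t_neg = pair_differences(target, pseudo)
cov_pos_t, cov_neg_t = fit_cov(t_pos), fit_cov(t_neg)  # re-estimated parameters
```

In this sketch a lower score means the pair looks more like a positive pair, so gallery items would be ranked by ascending `llr_distance`. The adaptation step touches only the distribution parameters and leaves the embeddings unchanged, which is one plausible reading of "fast parameter adaptation"; the paper's own models (Figures 4 and 5 refer to Gaussian mixtures with $K_0$ and $K_1$ components) are richer than the single Gaussian assumed here.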

Paper Structure

This paper contains 19 sections, 50 equations, 6 figures, 12 tables, 3 algorithms.

Figures (6)

  • Figure 1: Disadvantages of the existing CBRSOR methods. (a) Visualization of the distance between positive and negative sample pairs on the CBRSOR-FGSRSI dataset. The left graph of (a) shows the result of a cosine similarity-based measure, where a higher similarity for positive pairs and a lower similarity for negative pairs is better. The right graph of (a) shows the result of our proposed GLRT-based metric learning, where a smaller distance between positive sample pairs and a larger distance between negative sample pairs is better. (b) Existing methods build loss functions on sampled data, in which simple samples dominate. As a result, neural networks learn only the local relationships of the dataset and easily overfit to simple samples.
  • Figure 2: Flow chart of the proposed GLRTML.
  • Figure 3: Flow chart of the proposed CPLFPA.
  • Figure 4: Performance of the proposed method under different settings of the number of Gaussian components on CBRSOR-FGSRSI. Setting 1: $K_0 = N_g$, $K_1 = 1$ in GLRTML, with CPLFPA. Setting 2: $K_0 = N_g$, $K_1 = N_g$ in GLRTML, with CPLFPA. Setting 3: $K_0 = N_g$, $K_1 = 1$ in GLRTML, without CPLFPA. Setting 4: $K_0 = N_g$, $K_1 = N_g$ in GLRTML, with the covariance matrix of each Gaussian component of the GMM constrained to be diagonal, and with CPLFPA.
  • Figure 5: Visualization of the statistical model fitting capabilities of MG-GLRTML and GMM-GLRTML ($K_1 = 1$, $K_0 = 10$). (a) MG-GLRTML on the training set. (b) MG-GLRTML on the test set. (c) GMM-GLRTML on the training set. (d) GMM-GLRTML on the test set. All points are visualized using t-SNE ref64. Blue points are the embeddings of real positive samples, and red points are the embeddings of real negative samples. Cyan and pink points are the positive and negative sample embeddings fitted by the statistical model, respectively, obtained through Monte Carlo simulation ref65 based on that model.
  • ...and 1 more figure