Graph-Weighted Contrastive Learning for Semi-Supervised Hyperspectral Image Classification
Yuqing Zhang, Qi Han, Ligeng Wang, Kai Cheng, Bo Wang, Kun Zhan
TL;DR
This work tackles semi-supervised hyperspectral image (HSI) classification with limited labels by moving away from superpixel-based graph constructions, whose boundary errors propagate into the final classification. It proposes Graph-Weighted Contrastive Learning (GWCL), which builds a pixel-level graph and optimizes a joint loss $L = L_{gwcl} + \lambda L_{ce}$, where $L_{gwcl} = \frac{1}{|\mathcal{P}|} \sum_{(i,j)\in\mathcal{P}} s_{ij} \| z_i - z_j \|^2$ and $f(z_i,z_j|\theta) = \exp(- s_{ij} \| z_i - z_j \|^2)$, with the weight $s_{ij}$ derived from a graph on $(\bm x_i,\bm x_j)$. Graph construction uses ARM to reduce the spectral dimensionality to $\beta$ and forms $\bm x_i=[\bm h_i'; m; n]$, with $s_{ij}=\exp(-\tfrac{1}{2}(\bm x_i-\bm x_j)^T \Sigma^{-1}(\bm x_i-\bm x_j))$ for $K$-NN neighbors. GWCL supports mini-batch training by precomputing the graph and updating on node subsets; the backbone is a two-layer MLP with 180 hidden units. Experiments on Indian Pines, Salinas, and University of Pavia show that GWCL outperforms baselines that rely on superpixel partitioning, supporting pixel-level learning and the use of unlabeled data in semi-supervised HSI classification.
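The two formulas above can be made concrete with a small NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `knn_graph_weights` and `gwcl_loss` are hypothetical, $\Sigma^{-1}$ defaults to the identity, the pair set $\mathcal{P}$ is taken to be the directed $K$-NN edges, and the graph is stored densely, which is only practical for small inputs.

```python
import numpy as np

def knn_graph_weights(X, k, sigma_inv=None):
    """Dense (n, n) weight matrix with
    s_ij = exp(-0.5 * (x_i - x_j)^T Sigma^{-1} (x_i - x_j))
    for the k nearest neighbours j of each node i, and 0 elsewhere."""
    n, d = X.shape
    if sigma_inv is None:
        sigma_inv = np.eye(d)  # assumption: identity Sigma^{-1}
    diff = X[:, None, :] - X[None, :, :]  # pairwise differences, (n, n, d)
    dist2 = np.einsum("ijd,de,ije->ij", diff, sigma_inv, diff)
    np.fill_diagonal(dist2, np.inf)  # a node is not its own neighbour
    nbrs = np.argsort(dist2, axis=1)[:, :k]  # k nearest per row
    S = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    S[rows, cols] = np.exp(-0.5 * dist2[rows, cols])
    return S

def gwcl_loss(Z, S):
    """L_gwcl = mean over graph edges (i, j) of s_ij * ||z_i - z_j||^2.
    Assumption: the pair set P is the set of nonzero entries of S."""
    rows, cols = np.nonzero(S)
    sq = np.sum((Z[rows] - Z[cols]) ** 2, axis=1)
    return float(np.mean(S[rows, cols] * sq))
```

Here `X` would hold the graph features $\bm x_i$ as rows and `Z` the network embeddings $z_i$; the cross-entropy term $\lambda L_{ce}$ on labeled pixels would be added separately.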
Abstract
Most existing graph-based semi-supervised hyperspectral image classification methods rely on superpixel partitioning techniques. However, they suffer from misclassification of certain pixels due to inaccuracies in superpixel boundaries, i.e., the initial inaccuracies in superpixel partitioning limit overall classification performance. In this paper, we propose a novel graph-weighted contrastive learning approach that avoids the use of superpixel partitioning and directly employs neural networks to learn hyperspectral image representations. Furthermore, while many approaches require all graph nodes to be available during training, our approach supports mini-batch training by processing only a subset of nodes at a time, reducing computational complexity and improving generalization to unseen nodes. Experimental results on three widely used datasets demonstrate the effectiveness of the proposed approach compared to baselines relying on superpixel partitioning.
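One way to picture mini-batch training on a graph is sketched below in NumPy. This is a hedged illustration rather than the paper's procedure: it assumes the weight matrix `S` has been computed once beforehand, and that a batch only uses edges whose two endpoints both fall inside the sampled node subset. The helper names `iterate_minibatches` and `batch_pairs` are hypothetical.

```python
import numpy as np

def iterate_minibatches(n_nodes, batch_size, rng):
    """Yield disjoint random node subsets that together cover all nodes once."""
    perm = rng.permutation(n_nodes)
    for start in range(0, n_nodes, batch_size):
        yield perm[start:start + batch_size]

def batch_pairs(S, batch_idx):
    """Restrict a precomputed weight matrix S to the sampled nodes.
    Returns batch-local (rows, cols) indices and the weights of the edges
    whose endpoints are both in the batch (an assumption of this sketch)."""
    sub = S[np.ix_(batch_idx, batch_idx)]
    rows, cols = np.nonzero(sub)
    return rows, cols, sub[rows, cols]
```

Within a training step, only the embeddings of the nodes in `batch_idx` would need to be computed, and the returned weights would scale the squared embedding distances of the corresponding pairs.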
