Multi-level Graph Subspace Contrastive Learning for Hyperspectral Image Clustering
Jingxin Wang, Renxiang Guan, Kainan Gao, Zihao Li, Hao Li, Xianju Li, Chang Tang
TL;DR
This work tackles hyperspectral image clustering by addressing the lack of global-local interaction modeling in prior methods. It introduces Multi-level Graph Subspace Contrastive Learning (MLGSC), which builds dual feature views (spectral-spatial and texture), applies attention pooling to obtain a robust global graph representation, and leverages node-level and graph-level contrastive losses to fuse local and global information across views. A self-expression-based affinity learning stage further refines the clustering, culminating in spectral clustering on the learned affinity matrix $W = \tfrac{1}{2} (|C| + |C|^T)$. Empirically, MLGSC achieves state-of-the-art OA/NMI/Kappa on Indian Pines, Pavia University, Houston-2013, and Xu Zhou, validating its effectiveness and robustness for unsupervised HSI clustering with practical impact for large-scale remote sensing analytics.
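As a minimal sketch of the affinity construction mentioned above, the snippet below symmetrizes a coefficient matrix according to $W = \tfrac{1}{2} (|C| + |C|^T)$. It assumes $C$ is a square self-expression coefficient matrix with one row and one column per sample. The function name `affinity_from_coefficients` and the toy values are illustrative and not taken from the paper.

```python
import numpy as np


def affinity_from_coefficients(C: np.ndarray) -> np.ndarray:
    """Symmetrize a coefficient matrix: W = (|C| + |C|^T) / 2."""
    C = np.asarray(C, dtype=float)
    if C.ndim != 2 or C.shape[0] != C.shape[1]:
        raise ValueError("C must be a square matrix")
    A = np.abs(C)            # element-wise absolute value
    return 0.5 * (A + A.T)   # symmetric, non-negative affinity


# Toy 3x3 coefficient matrix (made-up values for illustration only).
C = np.array([[0.0, 0.8, -0.1],
              [0.6, 0.0, 0.0],
              [0.1, -0.2, 0.0]])
W = affinity_from_coefficients(C)
```

The resulting `W` is symmetric and non-negative, so it could be handed to a spectral clustering routine that accepts a precomputed affinity, for example scikit-learn's `SpectralClustering(affinity="precomputed")`. Whether the authors use that particular implementation is not stated here.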
Abstract
Hyperspectral image (HSI) clustering is a challenging task due to the high complexity of HSI data. Although subspace clustering shows impressive performance for HSI, traditional methods tend to ignore the global-local interaction in HSI data. In this study, we propose a multi-level graph subspace contrastive learning (MLGSC) model for HSI clustering. The model consists of three main parts. Graph convolution subspace construction: spectral and texture features are used to construct two graph convolution views. Local-global graph representation: local graph representations are obtained by step-by-step convolutions, and a more representative global graph representation is obtained using an attention-based pooling strategy. Multi-level graph subspace contrastive learning: multi-level contrastive learning is conducted to obtain local-global joint graph representations, improve the consistency of positive samples between views, and obtain more robust graph embeddings. Specifically, graph-level contrastive learning is used to better learn global representations of HSI data, while node-level intra-view and inter-view contrastive learning is designed to learn joint representations of local regions of HSI. The proposed model is evaluated on four popular HSI datasets: Indian Pines, Pavia University, Houston, and Xu Zhou. The overall accuracies are 97.75%, 99.96%, 92.28%, and 95.73%, respectively, which significantly outperform those of current state-of-the-art clustering methods.
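To make the node-level inter-view idea more concrete, here is a rough sketch of a contrastive objective between two views. The abstract does not specify the loss, so this assumes an InfoNCE-style formulation with cosine similarity and a temperature `tau`: the same node in the two views forms the positive pair, and all other nodes in the second view act as negatives. The function name `inter_view_info_nce`, the temperature value, and the random embeddings are hypothetical.

```python
import numpy as np


def inter_view_info_nce(z1: np.ndarray, z2: np.ndarray, tau: float = 0.5) -> float:
    """InfoNCE-style loss between two (N, d) embedding matrices.

    Row i of z1 and row i of z2 are treated as a positive pair;
    every other row of z2 is treated as a negative for row i of z1.
    """
    # L2-normalize rows so the dot product equals cosine similarity.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / tau                            # (N, N) scaled similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))             # average over positive pairs


# Toy embeddings (random values, purely illustrative).
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 4))
aligned = inter_view_info_nce(z, z)                       # views agree node by node
shuffled = inter_view_info_nce(z, np.roll(z, 1, axis=0))  # positives are mismatched
```

In this toy setup the loss is lower when the two views agree on each node than when the pairing is shifted by one position, which is the behavior a node-level contrastive term is meant to reward. An intra-view or graph-level term would follow the same pattern with different choices of positive and negative pairs.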
