
HOpenCls: Training Hyperspectral Image Open-Set Classifiers in Their Living Environments

Hengwei Zhao, Xinyu Wang, Zhuo Zheng, Jingtao Li, Yanfei Zhong

TL;DR

This work tackles open-set hyperspectral image classification in real-world deployments by leveraging unlabeled wild data that naturally contains both known and unknown classes. It reframes rejection of unknowns as a positive-unlabeled (PU) learning problem and introduces a multi-PU head to decompose it into class-wise sub-tasks, augmented by gradient-based Grad-C and Grad-E modules to manage abnormal gradient weights arising from wild data. The approach is theoretically motivated by Taylor-based TBCE analysis and empirically validated on WHU-Hi and University of Pavia datasets, achieving superior Open OA and unknown-rejection metrics (F1u) while maintaining known-class performance. The results indicate that wild data can significantly enhance open-set HSI classification in complex real-world settings, enabling safer and more reliable deployment without labor-intensive unknown-class annotations.

Abstract

Hyperspectral image (HSI) open-set classification is critical for HSI classification models deployed in real-world environments, where classifiers must simultaneously classify known classes and reject unknown classes. Recent methods use auxiliary unknown-class data to improve classification performance. However, this auxiliary data is assumed to be completely separable from the known classes, which is a strong assumption, and it requires labor-intensive annotation. To address this limitation, this paper proposes a novel framework, HOpenCls, that leverages unlabeled wild data, that is, a mixture of known and unknown classes. Such wild data is abundant and can be collected freely while classifiers are deployed in their living environments. The key insight is to reformulate open-set HSI classification with unlabeled wild data as a positive-unlabeled (PU) learning problem. Specifically, a multi-label strategy is introduced to bridge PU learning and open-set HSI classification, and gradient contraction and gradient expansion modules are then proposed to make this PU learning problem tractable, motivated by the observation of abnormal gradient weights associated with wild data. Extensive experimental results demonstrate that incorporating wild data has the potential to significantly enhance open-set HSI classification in complex real-world scenarios.
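
The division of labor described above (a multi-label PU component rejects unknowns while a multi-class classifier labels the known classes) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `open_set_predict`, the 0.5 threshold, the rule "reject when every class-wise PU score is below the threshold", and the `-1` unknown label are all assumptions made for illustration.

```python
import numpy as np

UNKNOWN = -1  # hypothetical label used here for rejected (unknown) pixels


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def open_set_predict(pu_logits, cls_logits, threshold=0.5):
    """Combine class-wise PU scores with multi-class predictions.

    pu_logits:  (N, K) array, one binary "belongs to class k" logit per
                known class (assumed output of a multi-PU head).
    cls_logits: (N, K) array from an ordinary multi-class classifier.
    A pixel is rejected when no class-wise PU score reaches the threshold
    (an assumed rule); otherwise it keeps the multi-class prediction.
    """
    pu_scores = sigmoid(np.asarray(pu_logits, dtype=float))
    is_known = pu_scores.max(axis=1) >= threshold
    known_labels = np.argmax(np.asarray(cls_logits, dtype=float), axis=1)
    return np.where(is_known, known_labels, UNKNOWN)


if __name__ == "__main__":
    pu = np.array([[3.0, -2.0, -4.0],    # class 0 head fires
                   [-3.0, -2.5, -4.0]])  # no head fires
    cls = np.array([[2.0, 0.1, 0.3],
                    [0.2, 1.5, 0.1]])
    print(open_set_predict(pu, cls))
```

Keeping the rejection decision separate from the known-class decision means the multi-class classifier can be swapped without retraining the rejection head, which is the property the sketch is meant to make visible.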

Paper Structure

This paper contains 15 sections, 40 equations, 9 figures, 10 tables.

Figures (9)

  • Figure 1: Comparison of classification results between a closed-set classifier and an open-set classifier on the University of Pavia dataset. The dataset originally contains nine known land cover classes; however, the closed-set results show significant misclassification of the unknown classes. For instance, unknown buildings with red roofs are misclassified as Bare S., Meadows, and other known materials by the closed-set classifier FPGA. Note that the spectral curves of known and unknown classes overlap significantly in HSI datasets, which poses a major problem for open-set HSI classification.
  • Figure 2: The proposed HOpenCls framework, which leverages wild data for open-set HSI classification. It includes a multi-PU head and the gradient contraction (Grad-C) and gradient expansion (Grad-E) PU learning algorithm. The PU learning component handles the rejection of unknown classes, while the classification of known classes is performed using an existing multi-class HSI classifier.
  • Figure 3: Comparison of how different modules handle wild known data. (a) The negative impact of replacing unknown-class data with wild data stems from the larger gradient weights associated with wild known data. (b) The Grad-C module reduces the gradient weights associated with both wild known and wild unknown data. (c) The Grad-E module restores the gradient weights for the wild unknown data through a weighting mechanism.
  • Figure 4: The WHU-Hi series HSI datasets: WHU-Hi-HongHu, WHU-Hi-LongKou, and WHU-Hi-HanChuan.
  • Figure 5: Open-set classification maps of WHU-Hi-HongHu dataset.
  • ...and 4 more figures
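
The "larger gradient weights associated with wild known data" noted for Figure 3(a) can be made concrete with a standard identity, under the assumption that wild samples are treated as negatives in a binary cross-entropy loss. For a logit z, the negative-class loss is -log(1 - sigmoid(z)), and its derivative with respect to z is sigmoid(z). A wild sample that actually belongs to a known class tends to have a large z, so it receives a large gradient weight. The sketch below only checks this identity numerically; it does not reproduce Grad-C or Grad-E, and all function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def negative_bce_loss(z):
    """Binary cross-entropy for a sample treated as negative (target 0)."""
    return -np.log(1.0 - sigmoid(z))


def negative_bce_grad_weight(z):
    """Analytic derivative of negative_bce_loss with respect to the logit z."""
    return sigmoid(z)


def numeric_grad(f, z, eps=1e-6):
    """Central finite-difference approximation of df/dz."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)


if __name__ == "__main__":
    # Assumed logits: a wild known sample scores high, a wild unknown scores low.
    for name, z in [("wild known", 3.0), ("wild unknown", -3.0)]:
        analytic = negative_bce_grad_weight(z)
        numeric = numeric_grad(negative_bce_loss, z)
        print(f"{name}: analytic={analytic:.4f} numeric={numeric:.4f}")
```

Under this assumption, the samples that should not be pushed toward the negative side are exactly the ones pulling hardest on the model, which is consistent with the motivation the caption gives for contracting gradient weights and then restoring them for wild unknown data.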