SpectralTrain: A Universal Framework for Hyperspectral Image Classification

Meihua Zhou, Liping Yu, Jiawei Cai, Wai Kin Fung, Ruiguo Hu, Jiarui Zhao, Wenzhuo Liu, Nan Wan

TL;DR

SpectralTrain addresses the efficiency bottleneck in hyperspectral image classification with a spectral curriculum that starts from PCA-based spectral downsampling and progressively restores full spectral information. The framework is architecture-agnostic and couples a compute-budgeted training schedule with CPU-based spectral compression, achieving 2–7× training speedups with small-to-moderate accuracy changes, depending on the backbone, across Indian Pines, Salinas-A, and CloudPatch-7. Key contributions include a universal training paradigm compatible with CNNs, 3D spectral networks, and transformers; a formal model of training cost and stage switching; and extensive experiments demonstrating robustness across spatial scales, spectral characteristics, and a climate-related cloud classification task. The work highlights training strategy optimization as a complement to architectural design in hyperspectral learning. Code is released at https://github.com/mh-zhou/SpectralTrain.

Abstract

Hyperspectral image (HSI) classification typically involves large-scale data and computationally intensive training, which limits the practical deployment of deep learning models in real-world remote sensing tasks. This study introduces SpectralTrain, a universal, architecture-agnostic training framework that enhances learning efficiency by integrating curriculum learning (CL) with principal component analysis (PCA)-based spectral downsampling. By gradually introducing spectral complexity while preserving essential information, SpectralTrain enables efficient learning of spectral-spatial patterns at significantly reduced computational cost. The framework is independent of specific architectures, optimizers, and loss functions, and is compatible with both classical and state-of-the-art (SOTA) models. Extensive experiments on three benchmark datasets (Indian Pines, Salinas-A, and the newly introduced CloudPatch-7) demonstrate strong generalization across spatial scales, spectral characteristics, and application domains. The results show consistent training speedups of 2–7×, with small-to-moderate accuracy deltas depending on the backbone. The application to cloud classification further reveals potential in climate-related remote sensing and underlines training strategy optimization as an effective complement to architectural design in HSI models. Code is available at https://github.com/mh-zhou/SpectralTrain.
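The data side of the curriculum described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`fit_pca`, `project`, `stage_for_epoch`), the synthetic cube, and the stage schedule are all assumptions made for the example, and the final stage here keeps all principal components as a stand-in for "full spectral information". How a backbone accepts a changing number of input channels is not covered.

```python
import numpy as np

def fit_pca(cube):
    """Fit PCA along the spectral axis of an (H, W, B) cube."""
    b = cube.shape[-1]
    X = cube.reshape(-1, b).astype(np.float64)
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt

def project(cube, mean, vt, k):
    """Keep the first k principal components of every pixel spectrum."""
    h, w, b = cube.shape
    X = cube.reshape(-1, b) - mean
    return (X @ vt[:k].T).reshape(h, w, k)

def stage_for_epoch(epoch, schedule):
    """Return k for the latest stage whose start epoch has been reached."""
    k = schedule[0][1]
    for start, stage_k in schedule:
        if epoch >= start:
            k = stage_k
    return k

# Synthetic stand-in for a hyperspectral cube (16 x 16 pixels, 32 bands).
rng = np.random.default_rng(0)
cube = rng.normal(size=(16, 16, 32))
mean, vt = fit_pca(cube)

# Hypothetical schedule: (start_epoch, number of retained components).
schedule = [(0, 4), (5, 8), (10, 16), (15, 32)]

for epoch in (0, 5, 10, 15):
    k = stage_for_epoch(epoch, schedule)
    x = project(cube, mean, vt, k)
    print(f"epoch {epoch:2d}: k={k:2d}, input shape={x.shape}")
```

Early epochs see only a few components per pixel, so the per-sample input is smaller and cheaper to move and process; later stages widen the spectral input until no components are dropped.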

Paper Structure

This paper contains 54 sections, 5 theorems, 31 equations, 8 figures, 8 tables, 1 algorithm.

Key Result

Proposition 1

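Let $\mathcal{T}_{H}:f\mapsto f*h$, where $H$ denotes the frequency response (Fourier transform) of the filter $h$. Since $|H|\le 1$, Plancherel's theorem gives $\|\,\mathcal{T}_{H}f\,\|_{L^{2}}\le\|f\|_{L^{2}}$, and by linearity $\|\mathcal{T}_{H}f-\mathcal{T}_{H}g\|_{L^{2}}\le\|f-g\|_{L^{2}}$. Thus pairwise separations in $L^{2}$ contract band-wise under LFC.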
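A discrete analogue of this contraction can be checked numerically. The sketch below assumes a circular convolution implemented through the FFT and uses an ideal low-pass mask as a hypothetical stand-in for the paper's LFC operator, whose exact definition is not reproduced here; the signal length and cutoff are arbitrary choices.

```python
import numpy as np

def apply_filter(f, H):
    """Circular convolution f * h, computed as multiplication by H = FFT(h)."""
    return np.fft.ifft(np.fft.fft(f) * H).real

def l2(x):
    """Discrete L2 norm."""
    return float(np.sqrt(np.sum(np.abs(x) ** 2)))

n = 256
rng = np.random.default_rng(0)
f = rng.normal(size=n)
g = rng.normal(size=n)

# Ideal low-pass frequency response: 1 inside the passband, 0 outside, so |H| <= 1.
# The mask is symmetric in frequency, so the filtered signal stays real.
freqs = np.fft.fftfreq(n)
H = (np.abs(freqs) <= 0.1).astype(float)

Tf, Tg = apply_filter(f, H), apply_filter(g, H)

print("norm does not grow:      ", l2(Tf) <= l2(f))
print("separation does not grow:", l2(Tf - Tg) <= l2(f - g))
```

Both comparisons follow from Parseval's identity: multiplying each frequency coefficient by a factor of magnitude at most 1 cannot increase the total energy, and the same argument applied to $f-g$ bounds the separation.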

Figures (8)

  • Figure 1: Overview of SpectralTrain. (a) Discrete decision-making based on RGB sample difficulty (e.g., selecting easier low-frequency content first). (b) Hyperspectral imaging-based curriculum (ours): a continuous transformation $T_t(\cdot)$ progressively introduces complex spectral and spatial patterns via PCA-based spectral reduction and image-level downsampling, gradually increasing data complexity as training proceeds.
  • Figure 2: Motivation example for a spectral curriculum on Indian Pines. (A) Average spectra of two clusters show band-localized discriminative cues, so uniform band dropping risks removing signals. (B) PCA cumulative explained variance illustrates that a small number of components preserve most energy, motivating an information-preserving low-cost start; PCA is used here as a representative linear compressor. (C) Per-epoch transfer and compute scale with the number of retained components $k$, so beginning with small $k$ saves early cost. (D) Curriculum stages progressively increase both spectral components and spatial size (PC1/PC2/PC3 composites). Panels illustrate one instantiation with PCA; Table \ref{tab:dr} shows that alternative reducers (UMAP/ICA) produce comparable behavior under the same staged schedule.
  • Figure 3: OA/AA across backbone scales under SpectralTrain.
  • Figure 4: Kappa across ResNet and ConvNeXt variants under SpectralTrain.
  • Figure 5: Supplementary Figure S1. Class distribution in CloudPatch-7. Bars show per-class sample counts (c01–c07), revealing moderate imbalance.
  • ...and 3 more figures

Theorems & Definitions (5)

  • Proposition 1: Contraction under LFC
  • Proposition 2: Irrecoverable high-frequency energy loss
  • Theorem 3: Eckart–Young–Mirsky
  • Proposition 4: Nyquist-consistent sampling per PC
  • Proposition 5: Separation versus approximation
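
Of the results listed above, Theorem 3 (Eckart–Young–Mirsky) is the standard statement that a truncated SVD gives the best rank-$k$ approximation of a matrix, which is what justifies PCA as an information-preserving compressor. The check below is a small numerical illustration on a random matrix; the matrix size, the rank $k$, and the random competitor are arbitrary choices for the example and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 20))  # stand-in for a (pixels x bands) data matrix
k = 5

# Rank-k truncated SVD of A.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = (U[:, :k] * s[:k]) @ Vt[:k]

# Approximation errors in the Frobenius and spectral norms.
fro_err = np.linalg.norm(A - A_k, "fro")
spec_err = np.linalg.norm(A - A_k, 2)

# The theorem predicts these errors from the discarded singular values.
print(np.isclose(fro_err, np.sqrt(np.sum(s[k:] ** 2))))
print(np.isclose(spec_err, s[k]))

# Any other rank-k approximation, here a projection onto a random
# k-dimensional subspace of the band space, does at least as badly.
Q, _ = np.linalg.qr(rng.normal(size=(20, k)))
B = A @ Q @ Q.T
print(np.linalg.norm(A - B, "fro") >= fro_err)
```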