HyTAS: A Hyperspectral Image Transformer Architecture Search Benchmark and Analysis

Fangqin Zhou, Mert Kilickaya, Joaquin Vanschoren, Ran Piao

TL;DR

HyTAS introduces the first hyperspectral transformer architecture search benchmark, enabling systematic evaluation of training-free proxies for HSI classification. The authors construct a search space of 2000 hyperspectral transformer subnetworks and benchmark 12 proxies across 5 datasets, demonstrating that proxies can often identify architectures that surpass a human-crafted baseline while tending toward larger models. They propose ZiCo++ as an enhanced proxy with superior correlation to true performance and analyze factors influencing both model performance and proxy scores. Additionally, HyTAS shows proxies can complement other search methods, notably by enabling predictive modeling (e.g., Random Forest) to forecast network performance with low training cost, guiding more efficient TAS in hyperspectral domains.

Abstract

Hyperspectral Imaging (HSI) plays an increasingly critical role in precise vision tasks within remote sensing, capturing a wide spectrum of visual data. Transformer architectures have significantly enhanced HSI task performance, while advancements in Transformer Architecture Search (TAS) have improved model discovery. To harness these advancements for HSI classification, we make the following contributions: i) we propose HyTAS, the first benchmark on transformer architecture search for hyperspectral imaging; ii) we comprehensively evaluate 12 different methods to identify the optimal transformer over 5 different datasets; iii) we perform an extensive factor analysis of hyperspectral transformer search performance to motivate future research in this direction. All benchmark materials are available at HyTAS.

Paper Structure (25 sections, 3 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: A scientist outside the AI domain may find it difficult to design a novel hyperspectral image transformer tailored to their tasks and data. To automate the design of hyperspectral transformers, we introduce Hyperspectral Transformer Architecture Search (HyTAS) in this paper. The scientist images were generated by DALL-E.
  • Figure 2: Diagram of TAS proxies for HSI classification: (1) Patchify the image and tokenize each patch along its spectra. (2) Randomly sample architectures from a designed search space. (3) Select a proxy to compute scores for sampled architectures after passing a batch of input. (4) Use the scores to rank architectures and choose the one with the highest score to retrain. Proxy evaluation metrics include test overall accuracy, model size, Spearman correlation between proxy scores and test OA, and search time.
  • Figure 3: Spearman correlation and proxy-proposed OA across different model size constraints on the Indian Pines dataset.
  • Figure 4: Spearman correlations between final test OA and components of the search space across five datasets. $num\_heads$ and $mlp\_ratio$ exhibit minimal correlation, while $embed\_dim$, $sum\_head\_dim$, and $sum\_mlp\_dim$ show high correlation across all datasets.
  • Figure 5: The best OA corresponding to different values of diverse factors, including the embedding dimension ($embed\_dim$), the depth ($depth$), the average number of heads ($mean\_heads\_num$), and the average MLP ratio ($mean\_mlp\_ratio$).
  • ...and 3 more figures
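
Steps (3) and (4) of Figure 2, together with the Spearman evaluation metric, can be illustrated with a minimal Python sketch. It assumes proxy scores have already been computed. The architecture names, proxy scores, and OA values are made-up placeholders, not results from the paper, and the helper functions (`average_ranks`, `spearman`, `select_best`) are hypothetical names, not part of the HyTAS code.

```python
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def select_best(proxy_scores: Dict[str, float]) -> str:
    """Step (4): pick the architecture with the highest proxy score."""
    return max(proxy_scores, key=lambda name: proxy_scores[name])


# Placeholder data (assumed for illustration only).
proxy_scores = {"arch_a": 0.8, "arch_b": 2.1, "arch_c": 1.4, "arch_d": 1.9}
test_oa = {"arch_a": 0.71, "arch_b": 0.83, "arch_c": 0.78, "arch_d": 0.85}

names = sorted(proxy_scores)
rho = spearman([proxy_scores[n] for n in names], [test_oa[n] for n in names])
chosen = select_best(proxy_scores)

print(f"Spearman correlation: {rho:.2f}")
print(f"Architecture chosen for retraining: {chosen}")
```

In this toy data the proxy swaps the order of the two strongest architectures, so the correlation is high but below 1 and the chosen architecture is not the one with the best OA. This is why correlation and the final OA of the selected architecture are reported as separate metrics.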