Channel-Aware Probing for Multi-Channel Imaging

Umar Marikkar; Syed Sameed Husain; Muhammad Awais; Sara Atito

TL;DR

Channel-Aware Probing (CAP) addresses heterogeneous channel configurations in Multi-Channel Imaging (MCI) by exploiting the inherent diversity between channels. CAP introduces Independent Feature Encoding (IFE), which encodes each channel separately, and Decoupled Pooling (DCP), which pools within channels before aggregating across them, so that channel-specific information is preserved during probing. On CHAMMI, JUMP-CP, and So2Sat, CAP consistently outperforms standard probing baselines and closes a large portion of the gap to full fine-tuning, matching the performance of fine-tuning from scratch in many cases. The work also analyzes the contribution of each component and the complexity of CAP, showing that the improvements hold across multiple pooling architectures and pre-trained encoders. CAP thus provides a practical, compute-efficient, channel-aware baseline for probing in MCI and makes pre-trained MCI encoders easier to reuse in downstream tasks.

Abstract

Training and evaluating vision encoders on Multi-Channel Imaging (MCI) data remains challenging because channel configurations vary across datasets, which prevents fixed-channel training and limits the reuse of pre-trained encoders on new channel settings. Prior work trains MCI encoders but typically evaluates them via full fine-tuning, leaving probing with frozen pre-trained encoders comparatively underexplored. Existing studies that do perform probing largely focus on improving representations rather than on how best to leverage fixed representations for downstream tasks. Although the latter problem has been studied in other domains, directly transferring those strategies to MCI yields weak results, worse even than training from scratch. We therefore propose Channel-Aware Probing (CAP), which exploits the intrinsic inter-channel diversity in MCI datasets by controlling feature flow at both the encoder and probe levels. CAP uses Independent Feature Encoding (IFE) to encode each channel separately, and Decoupled Pooling (DCP) to pool within channels before aggregating across channels. Across three MCI benchmarks, CAP consistently improves probing performance over the default probing protocol, matches fine-tuning from scratch, and substantially narrows the gap to full fine-tuning from the same MCI pre-trained checkpoints. Code is available at https://github.com/umarikkar/CAP.
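The two ideas named in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the CAP code base: `toy_encoder` is a hypothetical stand-in for a frozen encoder (it mixes every token with the sequence mean, loosely mimicking how self-attention shares information), and `attn_pool` is a simple softmax pooling against a fixed query vector, chosen only as an example of a pooling function.

```python
import math

def toy_encoder(tokens):
    """Hypothetical frozen encoder: blend each token with the sequence mean."""
    dim = len(tokens[0])
    mean = [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]
    return [[0.5 * t[d] + 0.5 * mean[d] for d in range(dim)] for t in tokens]

def joint_encode(channels):
    """Joint encoding: all channel tokens form one sequence, so channels mix."""
    flat = [tok for ch in channels for tok in ch]
    out = toy_encoder(flat)
    # Split the encoded sequence back into per-channel groups.
    result, start = [], 0
    for ch in channels:
        result.append(out[start:start + len(ch)])
        start += len(ch)
    return result

def independent_encode(channels):
    """IFE-style encoding: each channel is encoded on its own."""
    return [toy_encoder(ch) for ch in channels]

def attn_pool(tokens, query):
    """Softmax-weighted average of tokens, scored against a query vector."""
    scores = [sum(q * v for q, v in zip(query, t)) for t in tokens]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    dim = len(tokens[0])
    return [sum(e / total * t[d] for e, t in zip(exps, tokens)) for d in range(dim)]

def joint_pool(channel_feats, query):
    """Joint pooling: concatenate all channel tokens, then pool once."""
    return attn_pool([tok for ch in channel_feats for tok in ch], query)

def decoupled_pool(channel_feats, query):
    """DCP-style pooling: pool within each channel, then across channels."""
    per_channel = [attn_pool(ch, query) for ch in channel_feats]
    return attn_pool(per_channel, query)

# Two channels, two tokens each, two feature dimensions (made-up values).
channels = [[[1.0, 0.0], [0.8, 0.2]], [[0.0, 1.0], [0.1, 0.9]]]
query = [1.0, -1.0]
feats = independent_encode(channels)
probe_input = decoupled_pool(feats, query)
```

In this sketch, changing the second channel leaves the independently encoded features of the first channel untouched, whereas the jointly encoded features of the first channel shift. That is the property the abstract describes as controlling feature flow at the encoder level; the pooled vector `probe_input` would then be passed to a classifier head.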

Paper Structure (21 sections, 6 equations, 8 figures, 2 tables)

Introduction
Related work
Multi-Channel Vision Transformers (MC-ViTs)
Probing Architectures
Methodology
Exploring distinct properties in MCI datasets
Channel-Aware Probing for Multi-Channel Imaging
Maximizing feature diversity via Independent Feature Encoding.
Leveraging diverse channel features via Decoupled Pooling.
Datasets and Tasks.
Baselines and Comparisons.
Implementation details.
Results
Downstream performance on MCI benchmarks
Performance of CAP across pooling architectures.
...and 6 more sections

Figures (8)

Figure 1:
Figure 2:
Figure 4: Comparison of inter-channel feature diversity between MCI and RGB datasets. We compute the cosine similarity between $\mathtt{[cls]}$ tokens of individual channel features within a given instance, averaged over 1000 random instances.
Figure 5: (a) Joint feature encoding, the default encoding protocol in MC-ViTs. Information is shared between channels, resulting in pre-conditioned output features. (b) Independent feature encoding, where information is not shared between channels in the computation of the overall feature map. $\mathrm{x}_i$ denotes the set of tokenized embeddings for channel $i$.
Figure 6: (a) JAP: All channel-wise features are concatenated into a single sequence and then pooled. (b) DCP: Each set of channel-wise features is pooled independently to yield a single feature vector per channel; these vectors are then pooled again via the same function. Green arrows $\rightarrow$: reshaping operations.
...and 3 more figures
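
The diversity measure described in the Figure 4 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: it assumes the similarity is taken over all pairs of distinct channels within an instance and then averaged over instances, and the function names and toy vectors are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_inter_channel_similarity(instances):
    """instances: list of instances, each a list of per-channel [cls] vectors.

    Returns the mean pairwise cosine similarity between distinct channels,
    averaged first within each instance and then over all instances.
    """
    per_instance = []
    for cls_tokens in instances:
        pairs = list(combinations(cls_tokens, 2))
        per_instance.append(sum(cosine(u, v) for u, v in pairs) / len(pairs))
    return sum(per_instance) / len(per_instance)

# Toy inputs (made-up values): channels that point in similar directions
# versus channels that point in different directions.
similar_channels = [[[1.0, 0.9], [0.9, 1.0], [1.0, 1.0]]]
diverse_channels = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]]
sim_high = mean_inter_channel_similarity(similar_channels)
sim_low = mean_inter_channel_similarity(diverse_channels)
```

A lower average similarity indicates that the per-channel features carry more distinct information, which is the kind of inter-channel diversity that CAP is designed to preserve.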
