When Sensing Varies with Contexts: Context-as-Transform for Tactile Few-Shot Class-Incremental Learning

Yifeng Lin, Aiping Huang, Wenxi Liu, Si Wu, Tiesong Zhao, Zheng-Jun Zha

Abstract

Few-Shot Class-Incremental Learning (FSCIL) can be particularly susceptible to acquisition contexts with only a few labeled samples. A typical scenario is tactile sensing, where the acquisition context (e.g., diverse devices, contact state, and interaction settings) degrades performance due to a lack of standardization. In this paper, we propose Context-as-Transform FSCIL (CaT-FSCIL) to tackle the above problem. We decompose the acquisition context into a structured low-dimensional component and a high-dimensional residual component. The former can be easily affected by tactile interaction features, which are modeled as an approximately invertible Context-as-Transform family and handled via inverse-transform canonicalization optimized with a pseudo-context consistency loss. The latter mainly arises from platform and device differences, which can be mitigated with an Uncertainty-Conditioned Prototype Calibration (UCPC) that calibrates biased prototypes and decision boundaries based on context uncertainty. Comprehensive experiments on the standard benchmarks HapTex and LMT108 have demonstrated the superiority of the proposed CaT-FSCIL.
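
As a rough illustration of inverse-transform canonicalization and the pseudo-context consistency idea, the following Python sketch uses a toy context family. Everything in it is an assumption made for illustration and is not taken from the paper: the context is a hypothetical (log-gain, offset) pair acting on a spectrogram, the estimator is a simple moment-matching rule rather than a learned network, and all function names are invented.

```python
import numpy as np

def apply_context(M, c):
    """Hypothetical context transform T_c: scale by exp(s), then add offset b."""
    s, b = c
    return np.exp(s) * M + b

def invert_context(M, c):
    """Exact inverse of apply_context for the same context c."""
    s, b = c
    return (M - b) * np.exp(-s)

def estimate_context(M):
    """Toy estimator (assumption): the canonical spectrogram has zero mean
    and unit standard deviation, so gain and offset follow from moments."""
    return np.log(M.std()), M.mean()

def canonicalize(M):
    """Undo the estimated context to obtain a material-centric spectrogram."""
    return invert_context(M, estimate_context(M))

def pseudo_context_consistency(M, c_pseudo):
    """Mean squared difference between the canonical form of M and the
    canonical form of M after injecting a pseudo-context."""
    ref = canonicalize(M)
    aug = canonicalize(apply_context(M, c_pseudo))
    return float(np.mean((ref - aug) ** 2))

rng = np.random.default_rng(0)
# Observed sample: an unknown material pattern seen under context (0.7, -1.5).
M_obs = apply_context(rng.standard_normal((16, 32)), (0.7, -1.5))
loss = pseudo_context_consistency(M_obs, (-0.3, 2.0))
print(f"consistency loss: {loss:.3e}")
```

Because this toy family is exactly invertible and the moment-based estimator recovers it, the loss is numerically zero here. With a learned estimator and an only approximately invertible transform family, a loss of this kind would presumably act as a training signal rather than vanish.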

Paper Structure

This paper contains 18 sections, 20 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Acquisition context degrades FSCIL through two coupled failure modes. (a) Context overfitting makes prototypes encode context-dependent cues and yields unstable target representations. Context variation failure occurs when a novel acquisition context is interpreted as the closest previously observed one. (b) CaT-FSCIL applies a learned context transform to canonicalize the context-target feature into a more compact target representation, while handling the residual via context uncertainty to calibrate decision boundaries and mitigate misclassification.
  • Figure 2: The context transform is the core operation of Context-as-Transform (CaT), which uses low-dimensional context parameters to modulate acquisition-context effects and is shared across all modules in CaT-FSCIL. (i) At inference, the estimated context $c$ is used to canonicalize the observed spectrogram $\mathbf{M}_{y,c}$ into a material-centric representation $\mathbf{M}_y$, which is fed to the embedding learner and UCPC. (ii) During training, pseudo-contexts $\tilde{c}_p$ are injected to optimize the context estimator via pseudo-context consistency. (iii) In UCPC, $n_{\mathrm{ucpc}}$ pseudo-contexts are sampled to estimate context uncertainty from canonicalization stability and calibrate the classifier.
  • Figure 3: Photographs show example materials from LMT108. HapTex groups materials into coarse categories (e.g., velvet, leather, chiffon, wool, nylon, polyester, linen, and silk). LMT108 defines nine coarse categories, including meshes, stones, blank glossy surfaces, wood types, rubbers, fibers, foams, foils and papers, and textiles and fabrics. We apply dataset-specific normalization for numerical stability and ease of use. Context-dependent variations persist after this preprocessing, motivating CaT-FSCIL.
  • Figure 4: Performance comparison and ablation results. (a) HapTex, (b) LMT108, and (c) ablation on HapTex. CaT-FSCIL achieves the strongest overall performance across both datasets and remains robust to tactile-specific context-material entanglement, which can otherwise distort representations and impair FSCIL baselines.
  • Figure 5: Effect of the pseudo-context sampling count $n_{\mathrm{ucpc}}$ on performance for (a) HapTex and (b) LMT108. HapTex is comparatively stable across $n_{\mathrm{ucpc}}$, whereas LMT108 shows larger variation.
  • ...and 1 more figures
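
The Figure 2 caption describes UCPC as sampling $n_{\mathrm{ucpc}}$ pseudo-contexts, measuring how stable the canonicalization is, and using that as a context uncertainty to calibrate the classifier. A minimal Python sketch of that idea follows. The details are assumptions for illustration, not the paper's formulation: the canonicalizer is a deliberately imperfect stand-in that removes only an offset, the pseudo-contexts are random (log-gain, offset) pairs, and the shrinkage rule in `calibrate_prototype` is a hypothetical choice.

```python
import numpy as np

def canonicalize(M):
    """Toy stand-in (assumption): removes an additive offset only,
    so gain changes leak through and show up as instability."""
    return M - M.mean()

def context_uncertainty(M, n_ucpc, rng, gain_scale=0.2):
    """Sample n_ucpc pseudo-contexts, canonicalize each perturbed copy,
    and return the mean element-wise variance across the results."""
    outs = []
    for _ in range(n_ucpc):
        s = rng.normal(0.0, gain_scale)  # pseudo log-gain
        b = rng.normal(0.0, 1.0)         # pseudo offset
        outs.append(canonicalize(np.exp(s) * M + b))
    return float(np.stack(outs).var(axis=0).mean())

def calibrate_prototype(proto, prior, u):
    """Hypothetical calibration rule: shrink a few-shot prototype toward
    a prior, with weight growing as the uncertainty u increases."""
    w = u / (1.0 + u)
    return (1.0 - w) * proto + w * prior

rng = np.random.default_rng(0)
M = rng.standard_normal((16, 32))
u_stable = context_uncertainty(M, n_ucpc=8, rng=rng, gain_scale=0.0)
u_unstable = context_uncertainty(M, n_ucpc=8, rng=rng, gain_scale=0.5)
print(f"offset-only contexts: u = {u_stable:.3e}")
print(f"gain + offset contexts: u = {u_unstable:.3e}")
```

In this sketch the uncertainty is near zero when the canonicalizer can fully undo the sampled contexts and grows when it cannot, which is the behavior an uncertainty-conditioned calibration would rely on.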