XTransfer: Modality-Agnostic Few-Shot Model Transfer for Human Sensing at the Edge
Yu Zhang, Xi Zhang, Hualin Zhou, Xinyuan Chen, Shang Gao, Hong Jia, Jianfei Yang, Yuankai Qi, Tao Gu
TL;DR
This work addresses cross-modality, few-shot model transfer for human sensing on edge devices, where data scarcity and resource constraints hinder deployment. It introduces XTransfer, a modality-agnostic framework that combines a Splice–Repair–Removal (SRR) pipeline with a Layer-Wise Search (LWS) mechanism to repair and restructure pre-trained models using only a few sensor samples. The core ideas are anchor-based latent space alignment in a reduced PCA space, an anchor-based repair loss that minimizes layer-wise distribution shifts, and an efficient, NAS-inspired layer-recombining strategy that operates under resource budgets. Experimental results demonstrate state-of-the-art accuracy and substantial reductions in data needs, training time, and edge-deployment costs across multiple sensing modalities and datasets. Overall, XTransfer offers a scalable, practical path to reusing public pre-trained models for diverse edge sensing tasks with limited labeled data.
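To make the idea of anchor-based alignment in a reduced PCA space more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that anchors are per-class mean features of a source layer after PCA projection, and that the repair loss is the mean squared distance between projected few-shot target features and their class anchors. The function names (`fit_pca`, `class_anchors`, `anchor_repair_loss`), the feature dimensions, and the random data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_pca(features, n_components):
    """Fit PCA via SVD; return the feature mean and the top principal axes."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(features, mean, components):
    """Project features into the reduced PCA space."""
    return (features - mean) @ components.T

def class_anchors(projected, labels):
    """Assumed anchor definition: one mean vector per class in PCA space."""
    return {int(c): projected[labels == c].mean(axis=0) for c in np.unique(labels)}

def anchor_repair_loss(projected, labels, anchors):
    """Assumed loss: mean squared distance of each sample to its class anchor."""
    targets = np.stack([anchors[int(c)] for c in labels])
    return float(np.mean(np.sum((projected - targets) ** 2, axis=1)))

rng = np.random.default_rng(0)

# Hypothetical layer activations from a source model (200 samples, 64 dims, 4 classes).
source_feats = rng.normal(size=(200, 64))
source_labels = np.arange(200) % 4
mean, comps = fit_pca(source_feats, n_components=8)
anchors = class_anchors(project(source_feats, mean, comps), source_labels)

# Hypothetical few-shot target activations at the same layer (5 samples per class).
target_feats = rng.normal(size=(20, 64))
target_labels = np.arange(20) % 4
loss = anchor_repair_loss(project(target_feats, mean, comps), target_labels, anchors)
```

In a real training loop, a quantity like `loss` would be computed in a differentiable framework and minimized layer by layer while adapting the pre-trained weights; NumPy is used here only to keep the sketch self-contained.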
Abstract
Deep learning for human sensing on edge systems presents significant potential for smart applications. However, its training and development are hindered by the limited availability of sensor data and the resource constraints of edge systems. While transferring pre-trained models to different sensing applications is promising, existing methods often require extensive sensor data and computational resources, resulting in high costs and limited transferability. In this paper, we propose XTransfer, a first-of-its-kind method that enables modality-agnostic, few-shot model transfer with a resource-efficient design. XTransfer flexibly uses pre-trained models and transfers knowledge across different modalities through (i) model repairing, which safely mitigates modality shift by adapting pre-trained layers with only a few sensor samples, and (ii) layer recombining, which efficiently searches for and recombines layers of interest from source models in a layer-wise manner to restructure models. We benchmark various baselines across diverse human sensing datasets spanning different modalities. The results show that XTransfer achieves state-of-the-art performance while significantly reducing the costs of sensor data collection, model training, and edge deployment.
