Backpropagation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration

Wei Ji; Li Li; Zheqi Lv; Wenqiao Zhang; Mengze Li; Zhen Wan; Wenqiang Lei; Roger Zimmermann

TL;DR

This paper introduces a universal on-device multi-modal model adaptation framework that balances efficiency and effectiveness, presented by the authors as a pioneering solution for on-Device Multi-modal Model Adaptation (DMMA).

Abstract

In our increasingly interconnected world, where intelligent devices continually amass copious personalized multi-modal data, a pressing need arises to deliver high-quality, personalized, device-aware services. However, this endeavor presents a multifaceted challenge to prevailing artificial intelligence (AI) systems, which are primarily rooted in the cloud. As these systems grapple with shifting data distributions between the cloud and devices, the traditional approach of fine-tuning-based adaptation (FTA) suffers from two issues: the costly and time-consuming data annotation that FTA requires, and the looming risk of model overfitting. To surmount these challenges, we introduce a Universal On-Device Multi-modal Model Adaptation Framework that strikes a balance between efficiency and effectiveness in on-device model adaptation. The framework features the Fast Domain Adaptor (FDA) hosted in the cloud, which provides tailored parameters for the Lightweight Multi-modal Model on devices. To enhance adaptability across multi-modal tasks, the AnchorFrame Distribution Reasoner (ADR) minimizes communication costs. Our contributions, encapsulated in the Cloud-Device Collaboration Multi-modal Parameter Generation (CDC-MMPG) framework, represent a pioneering solution for on-Device Multi-modal Model Adaptation (DMMA). Extensive experiments validate the efficiency and effectiveness of our method, particularly in video question answering and retrieval tasks, driving forward the integration of intelligent devices into our daily lives.

Paper Structure (15 sections, 16 equations, 2 figures, 6 tables, 1 algorithm)

Introduction
Related Works
Methodology
Preliminary
Fast Domain Adaptor
AnchorFrame Distribution Reasoner
Experiment
Datasets
Evaluation Metrics
Tasks and Implementation Details
Performance Comparison
Baseline Methods
Main Results
Ablation Studies
Conclusion

Figures (2)

Figure 1: (a) Multi-modal data on the cloud and on different devices follow different distributions because of users' personalized preferences. (b) Compared with conventional methods of deploying models on different devices, we propose an FDA that balances efficiency and effectiveness.
Figure 2: Illustration of the overall pipeline of our method, CDC-MMPG. (a) and (b) represent the cloud model, which reconstructs the video features uploaded from the device and infers the personalized parameters of the device model from the reconstructed video features. (c) represents the lightweight multi-modal device-side model, which extracts the multi-modal features and uploads the video features to the cloud model so that it can predict the personalized device-model parameters. After being updated with these parameters, the lightweight device-side model further analyzes the multi-modal features and makes the final prediction.
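
The data flow described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names, the evenly spaced frame selection, the mean-pooled feature summary, and the fixed random linear generator are all assumptions made for illustration. The only point it tries to show is that the device receives its personalized weights from a forward pass on the cloud and never runs backpropagation itself.

```python
import random


def mean_pool(frames):
    """Average a list of per-frame feature vectors into one vector."""
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def select_anchor_frames(frames, k):
    """Hypothetical stand-in for anchor-frame selection: keep k evenly
    spaced frames so that fewer features are uploaded to the cloud."""
    if k <= 0:
        raise ValueError("k must be positive")
    if k >= len(frames):
        return list(frames)
    step = len(frames) / k
    return [frames[int(i * step)] for i in range(k)]


class CloudParameterGenerator:
    """Hypothetical cloud-side generator: maps a summary of the uploaded
    features to the weights of the device model's head (forward pass only)."""

    def __init__(self, feat_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.feat_dim = feat_dim
        self.out_dim = out_dim
        # Assumption: a real system would train these weights on the cloud;
        # here they are fixed random values.
        self.gen = [[rng.uniform(-0.1, 0.1) for _ in range(feat_dim)]
                    for _ in range(out_dim * feat_dim)]

    def generate(self, uploaded_frames):
        summary = mean_pool(uploaded_frames)
        flat = [sum(g * s for g, s in zip(row, summary)) for row in self.gen]
        # Reshape the flat vector into an (out_dim x feat_dim) weight matrix.
        return [flat[r * self.feat_dim:(r + 1) * self.feat_dim]
                for r in range(self.out_dim)]


class DeviceModel:
    """Hypothetical lightweight device model with a replaceable linear head."""

    def __init__(self, feat_dim, out_dim):
        self.head = [[0.0] * feat_dim for _ in range(out_dim)]

    def load_personal_parameters(self, weights):
        self.head = weights  # plain assignment, no gradient step on device

    def predict(self, frames):
        x = mean_pool(frames)
        scores = [sum(w * v for w, v in zip(row, x)) for row in self.head]
        return max(range(len(scores)), key=scores.__getitem__)


# Usage: 16 frames of 4-dimensional features, 4 anchors uploaded, 3 classes.
rng = random.Random(1)
frames = [[rng.random() for _ in range(4)] for _ in range(16)]
anchors = select_anchor_frames(frames, 4)
cloud = CloudParameterGenerator(feat_dim=4, out_dim=3)
device = DeviceModel(feat_dim=4, out_dim=3)
weights = cloud.generate(anchors)
device.load_personal_parameters(weights)
answer = device.predict(frames)
```

In this sketch only 4 of the 16 frame features leave the device, which is the kind of communication saving the abstract attributes to the ADR; how the paper actually selects and reconstructs frames is not specified on this page.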
