A Structure-Agnostic Co-Tuning Framework for LLMs and SLMs in Cloud-Edge Systems

Yuze Liu, Yunhan Wang, Tiehua Zhang, Zhishu Shen, Cheng Peng, Libing Wu, Feng Xia, Jiong Jin

TL;DR

Co-PLMs tackles bandwidth and privacy bottlenecks in cloud-edge LLM deployment by introducing Distilled Proxy Models (DPMs) to bridge server LLMs and on-device SLMs. It combines Domain-Specific Tuning (DST) and Structure-Agnostic Mutual Learning (SAML) to enable bidirectional knowledge exchange despite heterogeneous architectures. The approach yields state-of-the-art Rouge-L and EM gains on multi-domain QA benchmarks while reducing communication overhead, since only DPM parameters are exchanged between the server and the devices. This work enables practical, privacy-preserving cloud-edge collaboration for heterogeneous LLM–SLM ecosystems and provides publicly available code for reproducibility.

Abstract

The surge in intelligent applications driven by large language models (LLMs) has made it increasingly difficult for bandwidth-limited cloud servers to process extensive LLM workloads in real time without compromising user data privacy. To address these problems, recent research has focused on constructing cloud-edge consortia that integrate a server-based LLM with small language models (SLMs) on mobile edge devices. Furthermore, designing collaborative training mechanisms within such consortia to enhance inference performance has emerged as a promising research direction. However, the cross-domain deployment of SLMs, coupled with structural heterogeneity in SLM architectures, poses significant challenges to enhancing model performance. To this end, we propose Co-PLMs, a novel co-tuning framework for collaborative training of large and small language models, which integrates structure-agnostic mutual learning to realize knowledge exchange between the heterogeneous language models. This framework employs distilled proxy models (DPMs) as bridges to enable collaborative training between the heterogeneous server-based LLM and on-device SLMs, while preserving the domain-specific insights of each device. The experimental results show that Co-PLMs outperforms state-of-the-art methods, achieving average increases of 5.38% in Rouge-L and 4.88% in EM.
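
To make the idea of mutual learning between two models concrete, the following is a minimal sketch of a generic bidirectional distillation loss, in which each model is pulled toward the other's output distribution through a KL-divergence term. It is not the paper's SAML procedure, whose details are not given in this summary. It assumes, purely for illustration, that both models emit logits over the same small vocabulary; the function names, the temperature parameter, and the example logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mutual_learning_losses(logits_a, logits_b, temperature=1.0):
    """Return (loss_a, loss_b): each model's loss treats the other as teacher.

    Assumption: both logit vectors index the same vocabulary. Models with
    different tokenizers would need an alignment step that is not shown here.
    """
    p_a = softmax(logits_a, temperature)
    p_b = softmax(logits_b, temperature)
    loss_a = kl_divergence(p_b, p_a)  # pulls model A toward model B
    loss_b = kl_divergence(p_a, p_b)  # pulls model B toward model A
    return loss_a, loss_b


# Hypothetical next-token logits from a proxy model and an on-device SLM.
dpm_logits = [2.0, 0.5, -1.0]
slm_logits = [1.0, 1.0, 0.0]
loss_dpm, loss_slm = mutual_learning_losses(dpm_logits, slm_logits)
print(f"DPM loss: {loss_dpm:.4f}, SLM loss: {loss_slm:.4f}")
```

In a training loop, each of these terms would typically be added to the corresponding model's own task loss, so that both models keep fitting their local data while exchanging knowledge through their predictions.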

Paper Structure

This paper contains 16 sections, 9 equations, 3 figures, 2 tables, and 1 algorithm.

Figures (3)

  • Figure 1: The architecture of the cloud-edge system.
  • Figure 2: The overview of our proposed framework.
  • Figure 3: Communication overheads under different settings.