Expand Heterogeneous Learning Systems with Selective Multi-Source Knowledge Fusion
Gaole Dai, Huatao Xu, Yifan Yang, Rui Tan, Mo Li
TL;DR
The paper addresses how to expand a heterogeneous learning system to new target domains when labels are scarce and both data and devices are heterogeneous. It introduces HaT, a three-stage framework: an Efficient Model Selection Protocol picks $N_p$ high-quality source models, Sample-wise Knowledge Fusion combines their predictions with per-sample attention weights, and Adaptive Knowledge Injection distills the fused knowledge into a target model using an adaptive weight $\alpha$, with a Knowledge Dictionary to stabilize learning. HaT achieves accuracy gains of up to $16.5\%$ and cuts communication traffic by up to $39\%$, and it also reduces training time and memory through low-cost joint training and partial encoder freezing. Because it handles multiple sources, multiple architectures, and non-IID data across rounds $j$ in MRSE with a fraction $\gamma$ of labeled target data, HaT offers a practical, scalable way to expand real-world learning systems to new domains.
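To make the selection stage concrete, the following is a minimal sketch that ranks candidate source models by their accuracy on the small labeled portion of the target data and keeps the top $N_p$. It is an illustration under stated assumptions, not the paper's Efficient Model Selection Protocol, whose low-cost mechanics are not described in this summary. The function name `select_source_models` and the array layout are hypothetical.

```python
import numpy as np

def select_source_models(source_probs, labels, n_p):
    """Pick the n_p source models that score best on labeled target samples.

    source_probs: array of shape (num_models, num_samples, num_classes)
                  holding each source model's class probabilities.
    labels:       array of shape (num_samples,) with integer class labels.
    Assumption: plain accuracy is used as the quality score.
    """
    preds = source_probs.argmax(axis=-1)              # (num_models, num_samples)
    accuracy = (preds == labels[None, :]).mean(axis=1)
    # Stable sort on negated accuracy so ties keep the lower model index.
    order = np.argsort(-accuracy, kind="stable")
    return order[:n_p], accuracy
```

Calling `select_source_models(source_probs, labels, n_p=2)` returns the indices of the two best-scoring models together with the accuracy of every candidate, so the caller can inspect how close the rejected models were.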
Abstract
Expanding existing learning systems to provide high-quality customized models for more domains, such as new users, is hindered by limited labeled data and by data and device heterogeneity. While knowledge distillation methods can overcome label scarcity and device heterogeneity, they assume that the teachers are fully reliable and overlook data heterogeneity, which prevents existing models from being adopted directly. To address this problem, this paper proposes HaT, a framework for expanding learning systems. It first selects multiple high-quality models from the system at low cost and then fuses their knowledge by assigning sample-wise weights to their predictions. The fused knowledge is then selectively injected into the customized models according to its quality. Extensive experiments on different tasks, modalities, and settings show that HaT outperforms state-of-the-art baselines by up to 16.5% in accuracy and reduces communication traffic by up to 39%.
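The fusion and injection steps described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the per-sample relevance `scores` are a hypothetical input standing in for the output of HaT's attention mechanism, the loss is a standard distillation objective assumed here for illustration, and `alpha` is passed in as a fixed number because the adaptive rule that sets it is not given in this summary.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_predictions(source_probs, scores):
    """Combine source predictions with per-sample weights.

    source_probs: (num_models, num_samples, num_classes) class probabilities.
    scores:       (num_models, num_samples) relevance scores (hypothetical;
                  assumed to come from some attention module).
    """
    weights = softmax(scores, axis=0)                 # normalize over models
    fused = (weights[:, :, None] * source_probs).sum(axis=0)
    return fused, weights

def injection_loss(student_probs, fused_probs, labels, alpha, eps=1e-12):
    """alpha * KL(fused || student) + (1 - alpha) * cross-entropy on labels.

    Assumption: a generic distillation loss; alpha is supplied by the caller.
    """
    n = len(labels)
    ce = -np.log(student_probs[np.arange(n), labels] + eps).mean()
    kl = (fused_probs
          * (np.log(fused_probs + eps) - np.log(student_probs + eps))
          ).sum(axis=1).mean()
    return alpha * kl + (1.0 - alpha) * ce
```

Because the weights are normalized over the model axis separately for each sample, a source model can dominate the fused prediction on samples where it is scored highly and contribute little elsewhere. Setting `alpha` close to 0 falls back to supervised training on the labeled samples, which is one simple way to limit the influence of low-quality fused knowledge.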
