Optimize Incompatible Parameters through Compatibility-aware Knowledge Integration

Zheqi Lv; Keming Ye; Zishu Wei; Qi Tian; Shengyu Zhang; Wenqiao Zhang; Wenjie Wang; Kun Kuang; Tat-Seng Chua; Fei Wu

TL;DR

CKI introduces a compatibility-aware knowledge integration framework to directly optimize incompatible parameters without adding inference cost. It simultaneously evaluates local parameter uncertainty and global model information content to compute a per-parameter compatibility, then applies hard or soft splicing to fuse parameters from multiple pretrained models. The approach is validated on recommendation and language tasks, showing consistent improvements over pruning, averaging, and ensemble baselines and even enabling effective initialization with just one retraining epoch. CKI’s ability to leverage complementary strengths across models while preserving the original architecture makes it a practical and scalable solution for robust deployment under distribution shifts.

Abstract

Deep neural networks have become foundational to advances in multiple domains, including recommendation systems and natural language processing. Despite their successes, these models often contain incompatible parameters that can be underutilized or detrimental to model performance, particularly when faced with specific, varying data distributions. Existing research excels at removing such parameters or merging the outputs of multiple different pretrained models. However, the former focuses on efficiency rather than performance, while the latter requires several times more computing and storage resources to support inference. In this paper, we aim to explicitly improve these incompatible parameters by leveraging the complementary strengths of different models, thereby directly enhancing the models without any additional parameters. Specifically, we propose Compatibility-aware Knowledge Integration (CKI), which consists of two components: Parameter Compatibility Assessment, which evaluates the knowledge content of multiple models, and Parameter Splicing, which integrates that knowledge into one model. The integrated model can be used directly for inference or for further fine-tuning. We conduct extensive experiments on various datasets for recommendation and language tasks, and the results show that CKI can effectively optimize incompatible parameters across multiple tasks and settings, breaking through the training limit of the original model without increasing the inference cost.
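To make the Parameter Splicing step more concrete, here is a minimal sketch of how hard and soft splicing of two models' parameters might look once per-parameter compatibility scores are available. It is an illustration under stated assumptions, not the paper's implementation: the function names `hard_splice` and `soft_splice`, the `temperature` argument, the softmax-style weighting, and the toy scores are all hypothetical, and the way compatibility is actually computed from local uncertainty and global information content is not reproduced here.

```python
import numpy as np


def hard_splice(theta_a, theta_b, comp_a, comp_b):
    """Element-wise, keep the parameter whose compatibility score is higher.

    Assumption: a higher score means the parameter is more compatible.
    """
    return np.where(comp_a >= comp_b, theta_a, theta_b)


def soft_splice(theta_a, theta_b, comp_a, comp_b, temperature=1.0):
    """Element-wise blend, weighted by a two-way softmax over the scores.

    Assumption: the two-way softmax (a sigmoid of the score difference)
    stands in for whatever weighting the paper actually uses.
    """
    w_a = 1.0 / (1.0 + np.exp(-(comp_a - comp_b) / temperature))
    return w_a * theta_a + (1.0 - w_a) * theta_b


# Toy parameters for two models with identical architecture (M_A and M_B).
theta_a = np.array([0.2, -1.0, 0.5])
theta_b = np.array([0.4, 1.0, -0.5])

# Hypothetical per-parameter compatibility scores (placeholders, not paper values).
comp_a = np.array([0.9, 0.1, 0.5])
comp_b = np.array([0.1, 0.9, 0.5])

theta_hard = hard_splice(theta_a, theta_b, comp_a, comp_b)
theta_soft = soft_splice(theta_a, theta_b, comp_a, comp_b)
```

In both cases the result has the same shape as the inputs, which matches the abstract's claim that the integrated model adds no parameters and no inference cost. Hard splicing selects one source per parameter, while soft splicing interpolates between the two.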

Paper Structure (31 sections, 18 equations, 5 figures, 8 tables, 1 algorithm)

Introduction
Related Work
Methodology
Problem Formulation and Notations
CKI for Dual Models
Parameter Compatibility Assessment
Parameter Splicing
CKI for Multiple Models
Parameter Compatibility Assessment
Parameter Splicing
Experiments
Experimental Setup
Datasets
Baselines
Evaluation Metrics
...and 16 more sections

Figures (5)

Figure 1: (a) shows the Incompatible Parameter issue. (b) describes Model Pruning, which removes incompatible parameters from $M_A$. (c) presents output ensemble, which combines the inference results of $M_A$ and $M_B$ for a final result. (d) introduces CKI, which evaluates each parameter's compatibility in global and local views, then integrates the knowledge of $M_A$ and $M_B$ to get model $M_C$. (e) shows that CKI outperforms baselines in different scenarios.
Figure 2: Overview of the proposed CKI. Our CKI includes two parts: Parameter Compatibility Assessment and Parameter Splicing. (a) describes the Parameter Compatibility Assessment. It consists of 3 parts: (a1) Local-level Parameter Uncertainty Assessment, (a2) Global-level Model Information Content Assessment, and (a3) Dual-Perspective Parameter Compatibility Assessment. (b) describes the Parameter Splicing, which includes (b1) Hard Splicing and (b2) Soft Splicing. (c) describes the extension of CKI from 2 models to multiple models.
Figure 3: Performance comparison of the proposed method and baselines on the recommendation task when the pre-trained models are static models.
Figure 4: Performance comparison of the proposed method and baselines on the recommendation task when the pre-trained models include both static and dynamic models.
Figure 5: The impact of the number of integrated models.
