InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion

Zhaoyi Yan; Yiming Zhang; Baoyi He; Yuhao Fu; Qi Zhou; Zhijie Sang; Chunlin Ji; Shengyu Zhang; Fei Wu; Hongxia Yang

TL;DR

InfiFusion tackles the challenge of integrating multiple domain-specialized LLMs into a single high-performing pivot model without heavy parameter merging. It extends Universal Logit Distillation with Top-K logit selection and logits standardization, and offers Pairwise Fusion and Unified Fusion strategies to distill and merge knowledge efficiently across heterogeneous teachers. Empirical results across 11 benchmarks show that InfiFusion surpasses state-of-the-art models like Qwen-2.5-14B-Instruct and Phi-4 while using orders of magnitude fewer GPU hours. The approach provides a scalable, flexible pathway to deploy compact, high-performance LLMs in diverse tasks including reasoning, coding, mathematics, and instruction-following.

Abstract

We introduce InfiFusion, an efficient training pipeline designed to integrate multiple domain-specialized Large Language Models (LLMs) into a single pivot model, effectively harnessing the strengths of each source model. Traditional fusion methods either merge model parameters directly or rely on knowledge distillation with rigid assumptions, limiting their flexibility and efficiency. InfiFusion overcomes these limitations by enhancing Universal Logit Distillation (ULD) with Top-K selection and Logits Standardization. We propose two fusion strategies: Pairwise Fusion (InfiFusion$_p$), where each source model's knowledge is distilled individually into the pivot model and the resulting models are then merged, and Unified Fusion (InfiFusion$_u$), where knowledge from all source models is distilled simultaneously into the pivot model. InfiFusion outperforms state-of-the-art models, such as Qwen-2.5-14B-Instruct and Phi-4, across 11 widely used benchmarks covering reasoning, coding, mathematics, and instruction-following tasks. Notably, InfiFusion achieves this superior performance while significantly reducing computational costs, completing full training with only 160 H800 GPU hours compared to the millions typically required for traditional LLM training.

Paper Structure (25 sections, 21 equations, 2 figures, 8 tables)

Introduction
Related Work
Model Merging
Knowledge Distillation
Model Fusion
Methods
Preliminary
Supervised Fine-Tuning (SFT)
Optimal Transport Loss
Discrete 1-Wasserstein
Universal Logit Distillation Loss
Top-K Selection and Logits Standardization
Top-K Selection
Logits Standardization
Fusion Strategies
...and 10 more sections

Figures (2)

Figure 1: Performance of InfiFusion on the pivot models Phi-4 and Qwen2.5-14B-Instruct. In all cases, InfiFusion significantly outperforms the pivot model in terms of average score. Notably, InfiFusion requires only approximately $1/10000$ of the GPU hours used to train LLMs such as Qwen2.5-14B-Instruct and Phi-4.
Figure 2: Illustration of the InfiFusion framework, incorporating Top-K selection and logits standardization. Two fusion strategies are proposed: Pairwise Fusion, where each source model’s knowledge is distilled separately into the pivot model, and Unified Fusion, which aggregates knowledge from all source models simultaneously.
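
The following is a minimal sketch of how Top-K selection and logits standardization might be combined with a ULD-style loss. It is not the authors' implementation. It assumes that standardization means a z-score over the selected logits and that the ULD-style term is the L1 distance between the sorted student and teacher probabilities. The function names, the `k` and `temperature` parameters, and the small epsilon are hypothetical choices made for illustration.

```python
import numpy as np


def top_k_standardized_probs(logits, k, temperature=1.0):
    """Keep the k largest logits per position, z-score them, and apply softmax.

    Assumption: standardization is a per-position z-score over the kept logits.
    Returns probabilities sorted in descending order, shape (..., k).
    """
    logits = np.asarray(logits, dtype=np.float64)
    if k > logits.shape[-1]:
        raise ValueError("k must not exceed the vocabulary size")
    # Sort each row in descending order and keep the k largest logits.
    top = -np.sort(-logits, axis=-1)[..., :k]
    # Z-score standardization removes each model's logit shift and scale.
    mean = top.mean(axis=-1, keepdims=True)
    std = top.std(axis=-1, keepdims=True)
    z = (top - mean) / (std + 1e-8)
    # Numerically stable softmax over the standardized logits.
    z = z / temperature
    z = z - z.max(axis=-1, keepdims=True)
    p = np.exp(z)
    return p / p.sum(axis=-1, keepdims=True)


def uld_style_loss(student_logits, teacher_logits, k=10):
    """Mean L1 distance between sorted Top-K probabilities of student and teacher.

    Because both sides are sorted, no token-level vocabulary alignment is
    needed, so the two models may use different vocabularies.
    """
    s = top_k_standardized_probs(student_logits, k)
    t = top_k_standardized_probs(teacher_logits, k)
    return float(np.abs(s - t).sum(axis=-1).mean())


# Hypothetical example: 4 token positions, different vocabulary sizes.
rng = np.random.default_rng(0)
student = rng.normal(size=(4, 50))
teacher = rng.normal(size=(4, 80))
loss = uld_style_loss(student, teacher, k=10)
```

Under these assumptions the loss depends only on the shape of each model's sorted Top-K distribution, so it is unaffected by vocabulary ordering or by a positive rescaling of a teacher's logits. In a Unified Fusion setting one would presumably combine such terms across several teachers, but the combination rule is not specified in this summary.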
