FusionBench: A Unified Library and Comprehensive Benchmark for Deep Model Fusion

Anke Tang, Li Shen, Yong Luo, Enneng Yang, Han Hu, Lefei Zhang, Bo Du, Dacheng Tao

TL;DR

FusionBench introduces the first unified benchmark and library for deep model fusion. It addresses inconsistent evaluations by providing a modular, Hydra-configured framework built from Algorithm, Model Pool, and Task Pool components. It implements a broad taxonomy of fusion methods (ensemble, merging, mixing) and integrates model collections spanning CLIP-ViT, ResNet-50, GPT-2, and Flan-T5, enabling cross-domain multi-task evaluation. Experimental results show that adaptive and MoE-based fusion methods often outperform baselines and pre-trained models, while highlighting generalization and robustness challenges on unseen tasks and corrupted data, as well as scaling potential for large models and LLMs. The work provides extensive documentation and tutorials, and invites community contributions to advance standardized evaluation and development in deep model fusion.

Abstract

Deep model fusion is an emerging technique that unifies the predictions or parameters of several deep neural networks into a single better-performing model in a cost-effective and data-efficient manner. Although a variety of deep model fusion techniques have been introduced, their evaluations tend to be inconsistent and often inadequate to validate their effectiveness and robustness. We present FusionBench, the first benchmark and a unified library designed specifically for deep model fusion. Our benchmark consists of multiple tasks, each with different settings of models and datasets. This variety allows us to compare fusion methods across different scenarios and model scales. Additionally, FusionBench serves as a unified library for easy implementation and testing of new fusion techniques. FusionBench is open source and actively maintained, with community contributions encouraged. Homepage: https://github.com/tanganke/fusion_bench
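
The abstract distinguishes fusing predictions from fusing parameters. The following is a minimal sketch of that distinction in plain Python, not FusionBench code: the `StateDict` alias and the function names `average_parameters` and `ensemble_predictions` are hypothetical, and uniform averaging is used only as the simplest possible instance of each family.

```python
from typing import Dict, List

# Hypothetical stand-in for a model's parameters: name -> flat list of weights.
StateDict = Dict[str, List[float]]


def average_parameters(state_dicts: List[StateDict]) -> StateDict:
    """Parameter fusion: element-wise mean of same-shaped parameters."""
    if not state_dicts:
        raise ValueError("need at least one model")
    n = len(state_dicts)
    merged: StateDict = {}
    for name in state_dicts[0]:
        # Group the i-th weight of every model together, then average.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [sum(col) / n for col in columns]
    return merged


def ensemble_predictions(predictions: List[List[float]]) -> List[float]:
    """Prediction fusion: mean of per-model output scores for one input."""
    if not predictions:
        raise ValueError("need at least one prediction")
    n = len(predictions)
    return [sum(col) / n for col in zip(*predictions)]


# Toy data (illustrative only, not from the paper).
model_a = {"w": [1.0, 2.0], "b": [0.0]}
model_b = {"w": [3.0, 4.0], "b": [1.0]}
merged = average_parameters([model_a, model_b])

ensembled = ensemble_predictions([[0.5, 0.5], [1.0, 0.0]])
```

Parameter averaging yields a single model with the inference cost of one network, whereas ensembling keeps every model and combines their outputs at inference time.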

Paper Structure

This paper contains 34 sections, 6 figures, 17 tables, and 1 algorithm.

Figures (6)

  • Figure 1: The general framework of the modularized FusionBench codebase.
  • Figure 2: A taxonomy of deep model fusion techniques.
  • Figure 3: Flowchart of FusionBench.
  • Figure 4: Cosine similarity of task vectors for CLIP-ViT-B/32 and CLIP-ViT-L/14 models.
  • Figure 5: Radar charts comparing the performance of different deep model fusion methods across eight tasks using CLIP-ViT-B/32 and CLIP-ViT-L/14.
  • ...and 1 more figure
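
The Figure 4 caption refers to the cosine similarity of task vectors. As a rough illustration, the sketch below assumes the common task-arithmetic convention that a task vector is the fine-tuned weights minus the pre-trained weights; the caption itself does not define the term. The function names and the toy weights are hypothetical.

```python
import math
from typing import List


def task_vector(finetuned: List[float], pretrained: List[float]) -> List[float]:
    """Assumed definition: fine-tuned weights minus pre-trained weights."""
    if len(finetuned) != len(pretrained):
        raise ValueError("weight vectors must have the same length")
    return [f - p for f, p in zip(finetuned, pretrained)]


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two flattened weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy flattened weights (illustrative only, not from the paper).
pretrained = [0.0, 0.0, 0.0]
finetuned_task_a = [1.0, 0.0, 0.0]
finetuned_task_b = [0.0, 2.0, 0.0]

tv_a = task_vector(finetuned_task_a, pretrained)
tv_b = task_vector(finetuned_task_b, pretrained)

sim_ab = cosine_similarity(tv_a, tv_b)  # orthogonal updates
sim_aa = cosine_similarity(tv_a, tv_a)  # identical updates
```

A similarity near zero indicates that two fine-tuned models moved away from the pre-trained weights in nearly orthogonal directions.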