Fusing Models with Complementary Expertise

Hongyi Wang, Felipe Maia Polo, Yuekai Sun, Souvik Kundu, Eric Xing, Mikhail Yurochkin

TL;DR

The paper addresses generalization across diverse data distributions by fusing the outputs of multiple domain-specific experts. It formulates Fusion of Experts (FoE) as a supervised learning problem and reports strong performance on discriminative and generative tasks, including image and text classification, summarization, multiple-choice QA, and automatic evaluation of generated text. A frugal extension (FrugalFoE) reduces test-time cost by querying only a subset of experts sequentially; it is formulated as a graph shortest-path problem and uses kNN-based risk estimation. Across CIFAR-100, sentiment analysis, summarization, MMLU, and text generation tasks, FoE consistently outperforms individual experts and baselines, and FrugalFoE reduces the number of expert evaluations required. The approach exploits complementary information across experts and provides a scalable, easy-to-train fusion mechanism with a frugal option.

Abstract

Training AI models that generalize across tasks and domains has long been among the open problems driving AI research. The emergence of Foundation Models made it easier to obtain expert models for a given task, but the heterogeneity of data that may be encountered at test time often means that any single expert is insufficient. We consider the Fusion of Experts (FoE) problem of fusing outputs of expert models with complementary knowledge of the data distribution and formulate it as an instance of supervised learning. Our method is applicable to both discriminative and generative tasks and leads to significant performance improvements in image and text classification, text summarization, multiple-choice QA, and automatic evaluation of generated text. We also extend our method to the "frugal" setting where it is desired to reduce the number of expert model evaluations at test time. Our implementation is publicly available at https://github.com/hwang595/FoE-ICLR2024.
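
To make the "fusion as supervised learning" idea concrete, the following is a minimal sketch on toy data, not the authors' implementation. It assumes two hypothetical experts (`expert_a`, `expert_b`) that each return class probabilities, are accurate on their own domain, and are confidently biased elsewhere. Their outputs are concatenated into one feature vector, and a 1-nearest-neighbour classifier stands in as the fuser. That fuser choice, the helper names, and the numbers are assumptions made only for illustration.

```python
from math import dist

# Hypothetical toy experts returning [p(class 0), p(class 1)].
# They are simulated from (domain, label); real experts would see the input itself.
# Each is accurate on its own domain and confidently biased elsewhere (an assumption).
def expert_a(domain, label, jitter=0.0):
    if domain == "a":
        p0 = 0.9 - jitter if label == 0 else 0.1 + jitter
    else:
        p0 = 0.98 - jitter  # out of domain: always leans towards class 0
    return [p0, 1.0 - p0]

def expert_b(domain, label, jitter=0.0):
    if domain == "b":
        p0 = 0.9 - jitter if label == 0 else 0.1 + jitter
    else:
        p0 = 0.02 + jitter  # out of domain: always leans towards class 1
    return [p0, 1.0 - p0]

def features(domain, label, jitter=0.0):
    """Concatenate the experts' outputs into one input vector for the fuser."""
    return expert_a(domain, label, jitter) + expert_b(domain, label, jitter)

CASES = [(d, y) for d in ("a", "b") for y in (0, 1)]

# Supervised training set for the fuser: (concatenated expert outputs, true label).
train = [(features(d, y), y) for d, y in CASES]

def fuse(x):
    """1-nearest-neighbour fuser: return the label of the closest training vector."""
    return min(train, key=lambda pair: dist(pair[0], x))[1]

def average(x):
    """Baseline ensemble: average the experts' probabilities and take the argmax."""
    p0 = (x[0] + x[2]) / 2
    return 0 if p0 >= 0.5 else 1

# Slightly perturbed test points, so the fuser is not matching training vectors exactly.
test = [(features(d, y, jitter=0.03), y) for d, y in CASES]
fuser_acc = sum(fuse(x) == y for x, y in test) / len(test)
average_acc = sum(average(x) == y for x, y in test) / len(test)
print(fuser_acc, average_acc)  # 1.0 0.5
```

In this toy setup, averaging fails whenever the out-of-domain expert is confidently wrong, while the learned fuser picks up from the joint output pattern which expert to trust.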

Paper Structure

This paper contains 37 sections, 10 equations, 5 figures, and 14 tables.

Figures (5)

  • Figure 1: Three experts with complementary expertise (geometry, natural science, and history) process an input question on the Pythagorean theorem. Each outputs a response, and a Fusion of Experts (FoE) model processes these responses to arrive at a final output. Note that only one expert is capable of producing a high-quality output, so ensembling the experts is likely to perform poorly.
  • Figure 2: Frugal CIFAR-100 with various $\kappa$.
  • Figure 3: Average confusion matrix across Monte Carlo repetitions.
  • Figure 4: Overlap of sub-classes among the 20 partitions/experts.
  • Figure 5: FrugalFoE on CIFAR-100 with neural network as the fuser model.