Efficient Multi-Model Fusion with Adversarial Complementary Representation Learning
Zuheng Kang, Yayun He, Jianzong Wang, Junqing Peng, Jing Xiao
TL;DR
The paper tackles redundancy and inefficiency in multi-model fusion by introducing Adversarial Complementary Representation Learning (ACoRL), which compels a new alliance model to learn distinct, complementary representations from a set of pre-trained models using a gradient reversal mechanism. The approach defines a min–max objective that jointly optimizes a task loss and an adversarial loss to expand the latent representation space, promoting diversity across models. Empirical results on image classification (ImageNet-100) and speaker verification (VoxCeleb1/2) show that ACoRL-based fusion outperforms traditional MMF, with attribution analyses confirming the emergence of complementary knowledge in the learned representations. Overall, ACoRL offers a general, efficient framework for robust multi-model fusion with potential applicability across diverse domains and tasks.
Abstract
Single-model systems often fall short in tasks such as speaker verification (SV) and image classification because they rely heavily on partial prior knowledge during decision-making, which results in suboptimal performance. Although multi-model fusion (MMF) can mitigate some of these issues, redundancy in the learned representations may limit the improvement. To this end, we propose an adversarial complementary representation learning (ACoRL) framework that enables newly trained models to avoid previously acquired knowledge, allowing each individual component model to learn maximally distinct, complementary representations. We give three detailed explanations of why this works, and experimental results demonstrate that our method improves performance more efficiently than traditional MMF. Furthermore, attribution analysis validates that the model trained under ACoRL acquires more complementary knowledge, highlighting the efficacy of our approach in enhancing efficiency and robustness across tasks.
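
The gradient reversal idea behind the min–max objective can be illustrated with a scalar toy example. This is only a sketch under stated assumptions, not the authors' implementation: the names `GradientReversal` and `train_step`, the squared-error losses, and the single-parameter "models" are all hypothetical choices made for illustration. An adversary tries to predict a frozen model's representation `r` from the new model's representation `z`; the reversal layer flips the sign of the adversary's gradient before it reaches the new model, so the new model is pushed away from what the frozen model already encodes while still minimizing its task loss.

```python
from dataclasses import dataclass


@dataclass
class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""
    lam: float = 1.0

    def forward(self, z):
        return list(z)

    def backward(self, grad_out):
        return [-self.lam * g for g in grad_out]


def train_step(w, v, x, y, r, lam=0.5, lr=0.1):
    """One hand-derived update for a scalar toy problem (all quantities hypothetical).

    w: parameter of the new model, with representation z = w * x
    v: parameter of the adversary, which predicts r from z as v * z
    y: task target; r: representation produced by a frozen, pre-trained model
    """
    z = w * x

    # Task loss (z - y)^2 and its gradient with respect to z.
    task_grad_z = 2.0 * (z - y)

    # Adversarial loss (v * z - r)^2 and its gradients.
    err = v * z - r
    adv_grad_v = 2.0 * err * z  # the adversary minimizes this loss
    adv_grad_z = 2.0 * err * v  # gradient flowing back toward the new model

    # The reversal layer flips the sign, so the new model ascends the adversarial loss.
    reversed_grad_z = GradientReversal(lam).backward([adv_grad_z])[0]

    grad_w = (task_grad_z + reversed_grad_z) * x
    return w - lr * grad_w, v - lr * adv_grad_v


if __name__ == "__main__":
    w, v = train_step(w=1.0, v=1.0, x=1.0, y=2.0, r=0.5)
    print(w, v)
```

In an automatic-differentiation framework the same effect is usually obtained with a custom backward function rather than hand-derived gradients; the toy above only makes the sign flip explicit.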
