MeGA: Merging Multiple Independently Trained Neural Networks Based on Genetic Algorithm
Daniel Yun
TL;DR
MeGA addresses the problem of merging multiple independently trained CNNs with identical architectures by optimizing weight fusion with a genetic algorithm (GA). It defines a GA framework with initialization, tournament selection, crossover, and mutation, uses validation accuracy as the fitness function to guide evolution, and supports hierarchical multi-model merging. Experiments on CIFAR-10 show that MeGA consistently improves final test accuracy over individual models and simple weight averaging across the ResNet, Xception, and DenseNet families, and hierarchical merging of eight models further demonstrates scalability. The approach enables efficient, distributed-friendly integration of pre-trained networks, providing a scalable tool for enhancing neural network performance without additional training or architectural changes.
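The hierarchical merging mentioned above can be pictured as a bracket of pairwise merges. The following is a minimal sketch, assuming models are merged two at a time in rounds (for example 8 to 4 to 2 to 1); the pairing order is an assumption, not something stated in this summary. Both `hierarchical_merge` and `merge_pair` are hypothetical names, and `merge_pair` stands in for any two-model merge routine, such as one GA run.

```python
def hierarchical_merge(models, merge_pair):
    """Reduce a list of models to one by merging neighbours round by round.

    models     : list of model weights (any type that merge_pair accepts)
    merge_pair : hypothetical callable (model_a, model_b) -> merged model
    """
    if not models:
        raise ValueError("need at least one model")
    level = list(models)
    while len(level) > 1:
        # Merge neighbours (0, 1), (2, 3), ... within the current round.
        nxt = [merge_pair(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        # With an odd count, the last model advances unmerged (an assumption).
        if len(level) % 2 == 1:
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

With eight inputs this performs seven pairwise merges over three rounds. The four merges in the first round are independent of one another, so they could in principle run on separate workers.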
Abstract
In this paper, we introduce MeGA, a novel method for merging the weights of multiple pre-trained neural networks using a genetic algorithm. Traditional techniques, such as weight averaging and ensemble methods, often fail to fully harness the capabilities of pre-trained networks. Our approach uses a genetic algorithm with tournament selection, crossover, and mutation to optimize weight combinations, creating a more effective fusion. This technique allows the merged model to inherit advantageous features from its parent models, resulting in enhanced accuracy and robustness. Through experiments on the CIFAR-10 dataset, we demonstrate that our genetic algorithm-based weight merging method improves test accuracy compared to individual models and conventional methods. This approach provides a scalable solution for integrating multiple pre-trained networks across various deep learning applications. Code is available on GitHub at: https://github.com/YUNBLAK/MeGA-Merging-Multiple-Independently-Trained-Neural-Networks-Based-on-Genetic-Algorithm
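To make the loop of tournament selection, crossover, and mutation concrete, here is a minimal NumPy sketch for merging two models. It is an illustration under stated assumptions, not the authors' implementation: weights are treated as flat vectors, each individual is a vector of per-weight mixing coefficients in [0, 1], crossover is uniform, mutation is Gaussian, and the best individual is carried forward (elitism). The name `merge_two_models` and all hyperparameters are hypothetical. In the paper's setting, `fitness` would load the candidate weights into the CNN and return validation accuracy.

```python
import numpy as np

def merge_two_models(w_a, w_b, fitness, pop_size=20, generations=30,
                     tournament_k=3, mutation_rate=0.05,
                     mutation_scale=0.1, seed=0):
    """GA search over per-weight mixing coefficients (illustrative sketch).

    Requires pop_size >= 3 and tournament_k <= pop_size.
    Returns (merged_weights, best_fitness).
    """
    rng = np.random.default_rng(seed)
    n = w_a.size

    def decode(alpha):
        # alpha = 1 keeps model A's weight, alpha = 0 keeps model B's.
        return alpha * w_a + (1.0 - alpha) * w_b

    # Initialization: random coefficients, plus both parents and their
    # average so the search starts no worse than these baselines.
    pop = rng.random((pop_size, n))
    pop[0], pop[1], pop[2] = 1.0, 0.0, 0.5

    best_alpha, best_fit = pop[0].copy(), -np.inf
    for gen in range(generations + 1):
        fits = np.array([fitness(decode(ind)) for ind in pop])
        i = int(np.argmax(fits))
        if fits[i] > best_fit:
            best_fit, best_alpha = float(fits[i]), pop[i].copy()
        if gen == generations:
            break  # last population evaluated; no further breeding

        def tournament():
            # Draw k distinct individuals and keep the fittest one.
            idx = rng.choice(pop_size, size=tournament_k, replace=False)
            return pop[idx[np.argmax(fits[idx])]]

        children = [best_alpha.copy()]  # elitism (an assumption)
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            # Uniform crossover: each coefficient comes from either parent.
            child = np.where(rng.random(n) < 0.5, p1, p2)
            # Gaussian mutation applied to a random subset of coefficients.
            mutate = rng.random(n) < mutation_rate
            child = child + mutate * rng.normal(0.0, mutation_scale, n)
            children.append(np.clip(child, 0.0, 1.0))
        pop = np.array(children)

    return decode(best_alpha), best_fit
```

Because the two parents and their plain average are seeded into the initial population and the best individual is tracked across generations, the returned fitness cannot fall below those baselines on the data used for fitness. Whether that gain carries over to held-out test data is an empirical question.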
