MGE: A Training-Free and Efficient Model Generation and Enhancement Scheme
Xuan Wang, Zeshan Pang, Yuliang Lu, Xuehu Yan
TL;DR
This work tackles the high cost and data requirements of building large model pools by introducing MGE, a training-free scheme that generates and enhances neural networks via a GAN-like generator–discriminator setup operating in a frequency-domain latent space. MGE applies a discrete cosine transform (DCT) to map parameters to latent variables, retains the high-energy components selected by an energy threshold $t$, and samples the remaining unimportant components from a normal distribution. This preserves the core parameter distribution while enabling rapid generation of many models. The authors further extend the scheme to E-MGE, an evolutionary variant that uses mutation, fusion, and fitness-based selection to promote models with strong quality and diversity, improving generalization and robustness, including adversarial defense capabilities. Experiments across MNIST, CIFAR-10, FashionMNIST, GTSRB, and mini-ImageNet demonstrate that generated models achieve comparable or superior accuracy to normally trained models while reducing generation time to around 1% of conventional training, and that E-MGE seeds can generalize to new tasks with notable transferability. This approach offers a scalable, data-efficient path to building diverse model pools for security analysis, interpretability studies, and few-shot or adversarially robust applications.
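The frequency-domain step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dct_matrix` and `generate_variant`, the choice to flatten a single weight tensor, the reading of $t$ as a cumulative energy fraction, and the use of the discarded coefficients' own mean and standard deviation for the normal distribution are all assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x is the DCT of x, C.T @ y inverts it."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)     # rescale the DC row so that C is orthonormal
    return c

def generate_variant(weights, t=0.95, rng=None):
    """Hypothetical helper: keep the high-energy DCT coefficients of `weights`
    (assumed here to mean: at least a fraction t of total energy) and resample
    the remaining coefficients from a normal distribution."""
    rng = np.random.default_rng() if rng is None else rng
    flat = np.asarray(weights, dtype=np.float64).ravel()
    c = dct_matrix(flat.size)   # dense n x n matrix, only practical for small layers
    latent = c @ flat           # parameters -> frequency-domain latent variables

    # Rank coefficients by energy and keep the smallest set reaching fraction t.
    energy = latent ** 2
    order = np.argsort(energy)[::-1]
    cumulative = np.cumsum(energy[order]) / energy.sum()
    n_keep = min(int(np.searchsorted(cumulative, t)) + 1, flat.size)
    keep = np.zeros(flat.size, dtype=bool)
    keep[order[:n_keep]] = True

    # Assumption: the normal distribution matches the discarded coefficients' statistics.
    new_latent = latent.copy()
    low = latent[~keep]
    if low.size:
        new_latent[~keep] = rng.normal(low.mean(), low.std(), size=low.size)

    # Inverse DCT maps the modified latent vector back to parameter space.
    return (c.T @ new_latent).reshape(np.shape(weights)), keep
```

Calling `generate_variant(layer_weights, t=0.9)` repeatedly with different random generators would yield distinct parameter tensors that share the same dominant frequency components. For large layers, a fast transform such as `scipy.fft.dct` with `norm="ortho"` would be a more practical choice than the dense matrix used here.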
Abstract
Constructing a model pool is an essential step in providing a foundation for research on deep learning models. This paper proposes a Training-Free and Efficient Model Generation and Enhancement Scheme (MGE). The scheme primarily considers two aspects of the model generation process: the distribution of model parameters and model performance. Experimental results show that generated models are comparable to models obtained through normal training, and even superior in some cases. Moreover, generating a model takes only 1% of the time required for normal model training. More importantly, with the enhancement of Evolution-MGE, generated models exhibit competitive generalization ability in few-shot tasks, and their behavioral dissimilarity shows potential for adversarial defense.
