MO-EMT-NAS: Multi-Objective Continuous Transfer of Architectural Knowledge Between Tasks from Different Datasets

Peng Liao; XiLu Wang; Yaochu Jin; WenLi Du

TL;DR

The paper addresses the challenge of deploying models across diverse devices by balancing accuracy and efficiency for tasks from different datasets. It introduces MO-EMT-NAS, a multi-objective multi-task evolutionary NAS framework that uses per-task weight-sharing supernets and cross-task mating to transfer architectural knowledge, and an auxiliary objective to maintain diversity. The auxiliary objective, defined as $f_a = (1-\text{params})\, e^{-(1-\text{params})(1-\text{error})}$ with $\text{params}$ and $\text{error}$ normalized to $[0,1]$, promotes larger models at similar accuracy to counter the small-model trap, and parallel training further speeds up the search. Across seven datasets and 2–4 task settings, MO-EMT-NAS achieves superior Pareto fronts, measured by hypervolume (HV), and up to 77.7% runtime reduction compared with multi-objective single-task baselines. The discovered architectures also transfer effectively to ImageNet and medical datasets, demonstrating practical benefits for cross-dataset NAS.
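To make the auxiliary objective concrete, the following is a minimal Python sketch of the formula as stated in the TL;DR. The function name `auxiliary_objective` and the example values are hypothetical, and the sketch assumes that all objectives are minimized, which the summary does not state explicitly.

```python
import math


def auxiliary_objective(params: float, error: float) -> float:
    """f_a = (1 - params) * exp(-(1 - params) * (1 - error)).

    Both inputs are assumed to be normalized to [0, 1] already.
    """
    if not (0.0 <= params <= 1.0 and 0.0 <= error <= 1.0):
        raise ValueError("params and error must be normalized to [0, 1]")
    shrink = 1.0 - params    # small when the model is large
    accuracy = 1.0 - error   # large when the model is accurate
    return shrink * math.exp(-shrink * accuracy)


# Two hypothetical models with the same normalized error but different sizes.
small = auxiliary_objective(params=0.2, error=0.1)
large = auxiliary_objective(params=0.8, error=0.1)

# Two hypothetical models with the same size but different errors.
accurate = auxiliary_objective(params=0.5, error=0.1)
inaccurate = auxiliary_objective(params=0.5, error=0.3)
```

Under the minimization assumption, `large < small` and `accurate < inaccurate`: at equal error the larger model scores better on $f_a$, and at equal size the more accurate model scores better, which matches the stated goal of keeping larger models of similar accuracy in the population.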

Abstract

Deploying models across diverse devices demands tradeoffs among multiple objectives due to different resource constraints. Arguably, due to the small model trap problem in multi-objective neural architecture search (MO-NAS) based on a supernet, existing approaches may fail to maintain large models. Moreover, multi-tasking neural architecture search (MT-NAS) excels in handling multiple tasks simultaneously, but most existing efforts focus on tasks from the same dataset, limiting their practicality in real-world scenarios where multiple tasks may come from distinct datasets. To tackle the above challenges, we propose a Multi-Objective Evolutionary Multi-Tasking framework for NAS (MO-EMT-NAS) to achieve architectural knowledge transfer across tasks from different datasets while finding Pareto optimal architectures for multiple objectives, namely model accuracy and computational efficiency. To alleviate the small model trap issue, we introduce an auxiliary objective that helps maintain multiple larger models of similar accuracy. Moreover, the computational efficiency is further enhanced by parallelizing the training and validation of the weight-sharing-based supernet. Experimental results on seven datasets with two-, three-, and four-task combinations show that MO-EMT-NAS achieves a better minimum classification error while being able to offer flexible trade-offs between model performance and complexity, compared to the state-of-the-art single-objective MT-NAS algorithms. The runtime of MO-EMT-NAS is reduced by 59.7% to 77.7%, compared to the corresponding multi-objective single-task approaches.
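As an illustration of what "Pareto optimal architectures" means for two minimized objectives, here is a minimal Python sketch that filters non-dominated (error, size) pairs and computes a two-dimensional hypervolume, a common indicator for comparing Pareto fronts. The function names, the candidate values, and the reference point are all hypothetical; the paper's actual evaluation setup is not reproduced here.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (error, size), both to be minimized


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by error (ascending)."""
    return sorted(p for p in set(points)
                  if not any(dominates(q, p) for q in points))


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front and bounded by the reference point."""
    front = [p for p in pareto_front(points)
             if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    prev_size = ref[1]
    # Errors ascend along the front, so sizes descend; add one strip per point.
    for err, size in front:
        area += (ref[0] - err) * (prev_size - size)
        prev_size = size
    return area


# Hypothetical normalized (error, size) pairs for four candidate models.
candidates = [(0.10, 0.9), (0.15, 0.5), (0.30, 0.2), (0.20, 0.6)]
front = pareto_front(candidates)
hv = hypervolume_2d(candidates, ref=(1.0, 1.0))
```

In this example the candidate `(0.20, 0.6)` is dominated by `(0.15, 0.5)` and drops out of the front. A larger hypervolume means the front covers more of the objective space relative to the reference point.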

Paper Structure (15 sections, 3 equations, 4 figures, 7 tables)

Introduction
Related Work
Approach
MO-EMT-NAS
Auxiliary Third Objective
Parallel Training and Evaluation
Experiments
Settings
Performance Indicator
Two-task on CIFAR-10 and CIFAR-100
Transfer to ImageNet
Medical Multi-Objective Multi-Tasking
Ablation studies
Sensitivity Analysis
Conclusion

Figures (4)

Figure 1: A flowchart of MO-EMT-NAS, using a two-task scenario as an example. Different from single-task multi-objective evolutionary approaches, in MO-EMT-NAS, individuals in a population belong to different tasks, and cross-task reproduction enables implicit knowledge transfer. To alleviate negative transfer, each task operates with its own supernet and the corresponding parameter set. Hence, implementing the concept of multiprocessing naturally separates the optimization algorithm from each task's training and validation, further enhancing the computational efficiency.
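The cross-task reproduction described in the Figure 1 caption can be sketched roughly as follows. This is not the paper's operator: the genome encoding, the `rmp` (random mating probability) parameter borrowed from multifactorial evolutionary algorithms, and all function names are assumptions made for illustration only.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Individual:
    genome: List[int]  # hypothetical encoding: one operation index per edge
    task: int          # index of the task (and supernet) that evaluates it


def crossover(a: List[int], b: List[int], rng: random.Random) -> List[int]:
    """Uniform crossover: each gene comes from either parent."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(genome: List[int], num_ops: int, rng: random.Random,
           rate: float = 0.1) -> List[int]:
    """Resample each gene with probability `rate`."""
    return [rng.randrange(num_ops) if rng.random() < rate else g
            for g in genome]


def make_offspring(p1: Individual, p2: Individual, num_ops: int,
                   rng: random.Random, rmp: float = 0.3) -> Individual:
    """Mate freely within a task; mate across tasks with probability rmp."""
    if p1.task == p2.task or rng.random() < rmp:
        # Cross-task crossover is where implicit knowledge transfer happens.
        genome = crossover(p1.genome, p2.genome, rng)
        task = rng.choice([p1.task, p2.task])
    else:
        genome = mutate(p1.genome, num_ops, rng)
        task = p1.task
    return Individual(genome, task)


def group_by_task(population: List[Individual],
                  num_tasks: int) -> List[List[Individual]]:
    """Split the population so each task's supernet sees only its own batch."""
    buckets: List[List[Individual]] = [[] for _ in range(num_tasks)]
    for ind in population:
        buckets[ind.task].append(ind)
    return buckets


rng = random.Random(0)
parent_a = Individual(genome=[0] * 6, task=0)
parent_b = Individual(genome=[1] * 6, task=1)
children = [make_offspring(parent_a, parent_b, num_ops=4, rng=rng, rmp=1.0)
            for _ in range(20)]
batches = group_by_task(children, num_tasks=2)
```

Because each batch in `batches` depends only on its own task's supernet, the batches could be handed to separate worker processes for training and validation, which is the separation the caption attributes to multiprocessing. That dispatch step is not shown here.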
Figure 2: Comparison of the impact of different evolutionary strategies on the convergence of the population, each showing the population distribution in the initial, second, fourth, sixth, eighth, and the final generations. By simultaneously optimizing the model error, size and the proposed auxiliary objective, we can obtain models of various sizes.
Figure 3: Visualization of the two tasks of CIFAR-10 and CIFAR-100. The corresponding model error and size are summarized in Table A in the Supplementary Material.
Figure 4: Visualization on MedMNIST with two-, three- and four-tasking settings. The values of the task relatedness score are shown in the figure labeled "Relatedness scores". The corresponding model error and size are summarized in Tables D-N in the Supplementary Material.
