MEL: Efficient Multi-Task Evolutionary Learning for High-Dimensional Feature Selection
Xubin Wang, Haojiong Shangguan, Fengyi Huang, Shangrui Wu, Weijia Jia
TL;DR
MEL is a PSO-based multi-task evolutionary learning framework for high-dimensional feature selection. It splits the population into two subpopulations that learn feature importance and transfer knowledge to guide the search, balancing accuracy and parsimony through a fitness function that penalizes large feature subsets. Extensive experiments on 12 high-dimensional genetic datasets and 10 large-sample datasets show that MEL achieves superior or competitive accuracy, produces compact feature subsets, and runs in favorable time relative to a wide range of baselines. The approach demonstrates scalable, effective feature selection in ultra-high-dimensional settings, and the code is open source to support reproducibility and practical adoption.
Abstract
Feature selection is a crucial step in data mining to enhance model performance by reducing data dimensionality. However, the increasing dimensionality of collected data exacerbates the challenge known as the "curse of dimensionality", where computation grows exponentially with the number of dimensions. To tackle this issue, evolutionary computation (EC) approaches have gained popularity due to their simplicity and applicability. Unfortunately, the diverse designs of EC methods give them varying abilities to handle different data, and they often underutilize information and fail to share it effectively. In this paper, we propose a novel approach called PSO-based Multi-task Evolutionary Learning (MEL) that leverages multi-task learning to address these challenges. By incorporating information sharing between different feature selection tasks, MEL achieves enhanced learning ability and efficiency. We evaluate the effectiveness of MEL through extensive experiments on 22 high-dimensional datasets. Compared against 24 EC approaches, our method exhibits strong competitiveness. Additionally, we have open-sourced our code on GitHub at https://github.com/wangxb96/MEL.
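To make the idea of a fitness function that penalizes large feature subsets concrete, here is a minimal sketch in Python. It is not taken from the MEL code: the weighted-sum form, the function name `penalized_fitness`, and the weight `alpha` are assumptions chosen for illustration, following a form that is common in wrapper-based feature selection.

```python
def penalized_fitness(error_rate, n_selected, n_total, alpha=0.9):
    """Illustrative fitness (lower is better); an assumed form, not MEL's exact one.

    error_rate : classification error of a model trained on the selected features
    n_selected : number of features in the candidate subset
    n_total    : total number of features in the dataset
    alpha      : hypothetical weight trading off error against subset size
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_selected <= n_total:
        raise ValueError("n_selected must lie in [0, n_total]")
    size_ratio = n_selected / n_total
    # Weighted sum: the second term grows with the size of the subset,
    # so larger subsets are penalized at equal error.
    return alpha * error_rate + (1.0 - alpha) * size_ratio


# Two candidate subsets with the same error: the smaller one scores better.
compact = penalized_fitness(error_rate=0.10, n_selected=50, n_total=1000)
bulky = penalized_fitness(error_rate=0.10, n_selected=500, n_total=1000)
```

Under this assumed form, a search procedure such as PSO would minimize the returned value, so a subset is preferred only if any increase in size is offset by a sufficient drop in error.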
