Embedded Multi-label Feature Selection via Orthogonal Regression

Xueyuan Xu, Fulin Wei, Tianyuan Jia, Li Zhuo, Feiping Nie, Xia Wu

TL;DR

A novel embedded multi-label feature selection method, termed global redundancy and relevance optimization in orthogonal regression (GRROOR), is proposed to facilitate multi-label feature selection.

Abstract

Over the last decade, embedded multi-label feature selection methods, which incorporate the search for feature subsets into model optimization, have attracted considerable attention for accurately evaluating the importance of features in multi-label classification tasks. Nevertheless, state-of-the-art embedded multi-label feature selection algorithms based on least squares regression usually cannot preserve sufficient discriminative information in multi-label data. To tackle this challenge, a novel embedded multi-label feature selection method, termed global redundancy and relevance optimization in orthogonal regression (GRROOR), is proposed. The method employs orthogonal regression with feature weighting to retain sufficient statistical and structural information related to local label correlations of the multi-label data in the feature learning process. Additionally, both global feature redundancy and global label relevance information are considered in the orthogonal regression model, which helps the search for discriminative and non-redundant feature subsets in the multi-label data. The cost function of GRROOR is an unbalanced orthogonal Procrustes problem on the Stiefel manifold. A simple yet effective scheme is used to obtain an optimal solution. Extensive experimental results on ten multi-label data sets demonstrate the effectiveness of GRROOR.
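To make the "unbalanced orthogonal Procrustes problem on the Stiefel manifold" concrete, the sketch below solves the plain problem $\min_{W^\top W = I} \|XW - Y\|_F^2$ with a tall $W$ (more features than labels), for which no closed-form SVD solution exists. It uses a generalized power iteration, a known technique for quadratic problems on the Stiefel manifold; whether GRROOR's "simple yet effective scheme" is this exact iteration is an assumption, and the sketch deliberately omits the feature weights and the redundancy and relevance terms mentioned in the abstract. The function name `stiefel_least_squares` and the random data are hypothetical.

```python
import numpy as np


def stiefel_least_squares(X, Y, n_iter=200, seed=0):
    """Minimize ||X W - Y||_F^2 subject to W^T W = I (W is d x c, d >= c).

    Illustrative generalized power iteration; not the paper's full objective.
    """
    rng = np.random.default_rng(seed)
    d, c = X.shape[1], Y.shape[1]
    A = X.T @ X                      # quadratic term, d x d, symmetric PSD
    B = X.T @ Y                      # linear term, d x c
    alpha = np.linalg.eigvalsh(A)[-1]        # largest eigenvalue of A
    A_shift = alpha * np.eye(d) - A          # PSD, so the surrogate is convex
    # Random starting point with orthonormal columns.
    W, _ = np.linalg.qr(rng.standard_normal((d, c)))
    history = []
    for _ in range(n_iter):
        history.append(np.linalg.norm(X @ W - Y) ** 2)
        M = 2.0 * A_shift @ W + 2.0 * B
        # The closest point on the Stiefel manifold to M is U V^T.
        U, _, Vt = np.linalg.svd(M, full_matrices=False)
        W = U @ Vt
    history.append(np.linalg.norm(X @ W - Y) ** 2)
    return W, history


# Hypothetical toy data: 50 samples, 8 features, 3 labels.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 8))
Y = (rng.standard_normal((50, 3)) > 0).astype(float)
W, history = stiefel_least_squares(X, Y)
print("orthonormal columns:", np.allclose(W.T @ W, np.eye(3)))
print("objective: first %.4f, last %.4f" % (history[0], history[-1]))
```

Each update keeps $W$ exactly on the Stiefel manifold, and because the shifted matrix is positive semidefinite the objective does not increase from one iteration to the next. The iteration converges to a stationary point, which is not guaranteed to be the global minimizer.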

Paper Structure (26 sections, 40 equations, 6 figures, 4 tables, 1 algorithm)

Figures (6)

  • Figure 1: The GRROOR framework consists of three sections: (a) exploring global feature redundancy; (b) exploiting global label correlations; (c) evaluating local label relevance.
  • Figure 2: Comparison results of multi-label feature selection methods in terms of redundancy, coverage, and Hamming loss.
  • Figure 3: Comparison results of multi-label feature selection methods in terms of average precision, macro-F1, and micro-F1.
  • Figure 4: Multi-label classification performance with different numbers of selected features on the Slashdot data set: (a) Redundancy; (b) Coverage; (c) Hamming loss; (d) Average precision; (e) Macro-F1; (f) Micro-F1.
  • Figure 5: The Nemenyi test results ($CD = 4.2841$, $\alpha = 0.05$): (a) Redundancy; (b) Coverage; (c) Hamming loss; (d) Average precision; (e) Macro-F1; (f) Micro-F1.
  • ...and 1 more figures
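
The critical difference quoted in the Figure 5 caption can be sanity-checked with the standard Nemenyi formula $CD = q_\alpha \sqrt{k(k+1)/(6N)}$, where $k$ is the number of compared methods and $N$ the number of data sets. The sketch below is a rough check under stated assumptions: $N = 10$ comes from the abstract, while $k = 10$ is an assumption (the number of compared methods is not given in this excerpt), and $q_{0.05} = 3.164$ is the commonly tabulated critical value for ten methods. The function name `nemenyi_cd` is hypothetical.

```python
import math


def nemenyi_cd(q_alpha, k, n):
    """Critical difference of the Nemenyi post-hoc test.

    q_alpha: tabulated critical value for k methods at the chosen alpha
    k:       number of compared methods
    n:       number of data sets
    """
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


# Assumed inputs: k = 10 methods (not stated in this excerpt), N = 10 data sets.
cd = nemenyi_cd(q_alpha=3.164, k=10, n=10)
print(f"CD = {cd:.4f}")
```

Under these assumptions the computed value agrees with the caption's $CD = 4.2841$ to four decimal places. Two methods whose average ranks differ by more than this amount are considered significantly different at $\alpha = 0.05$.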