Group-Feature (Sensor) Selection With Controlled Redundancy Using Neural Networks

Aytijhya Saha, Nikhil R. Pal

TL;DR

The paper addresses supervised feature selection (FS) and group-feature, or sensor, selection (GFS) with controlled redundancy by embedding a redundancy penalty and a generalized group-lasso penalty in a multilayer perceptron. It formalizes the loss as $E=E_0+\lambda P+\mu GL$, where $\lambda$ weights the redundancy penalty $P$ and $\mu$ weights the group-lasso term $GL$, derives gradient-based update rules, and proves monotonicity and (weak/strong) convergence for a smoothed version of the penalties. Experiments on diverse datasets, including RNA-Seq, show competitive accuracy with substantially fewer selected features or sensors and lower redundancy, and the method outperforms several neural and non-neural baselines in many cases. The approach offers a unified, efficient strategy for FS and GFS that may extend to other neural architectures and domains, improving interpretability and reducing computation.

Abstract

In this paper, we present a novel embedded feature selection method, based on a Multi-layer Perceptron (MLP) network, that can control the level of redundancy among the selected features, and we generalize it to group-feature or sensor selection problems. Additionally, we generalize the group lasso penalty for feature selection so that it selects valuable group features while simultaneously maintaining control over redundancy. We establish the monotonicity and convergence of the proposed algorithm, with a smoothed version of the penalty terms, under suitable assumptions. Experimental results on several benchmark datasets demonstrate the promising performance of the proposed methodology for both feature selection and group-feature selection compared with some state-of-the-art methods.
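
The composite loss $E=E_0+\lambda P+\mu GL$ quoted in the TL;DR can be illustrated with a minimal sketch. The summary gives only the overall form of the loss, so everything below is an assumption made for illustration: the function names, the choice of a pairwise dependency matrix `dep` (for example, squared correlations between features), the product-of-norms form of the redundancy penalty, and the sum-of-group-norms form of the group lasso. It should not be read as the authors' exact definitions.

```python
import math

def row_norm(row):
    """Euclidean norm of the weights leaving one input node."""
    return math.sqrt(sum(w * w for w in row))

def group_lasso(W, groups):
    """Assumed GL term: sum over groups of the joint norm of that group's rows.

    W[i] holds the weights from input feature i to the hidden nodes.
    groups is a list of index lists, one per sensor (hypothetical layout).
    """
    return sum(
        math.sqrt(sum(w * w for i in idx for w in W[i]))
        for idx in groups
    )

def redundancy_penalty(W, dep):
    """Assumed P term: average of ||w_i|| * ||w_j|| * dep[i][j] over i != j.

    dep[i][j] in [0, 1] is a hypothetical dependency measure between
    features i and j; two strongly dependent features are penalized
    only when both keep nonzero input weights.
    """
    norms = [row_norm(r) for r in W]
    p = len(norms)
    total = sum(
        norms[i] * norms[j] * dep[i][j]
        for i in range(p) for j in range(p) if i != j
    )
    return total / (p * (p - 1))

def total_loss(e0, W, dep, groups, lam, mu):
    """E = E_0 + lambda * P + mu * GL, with e0 the plain training error."""
    return e0 + lam * redundancy_penalty(W, dep) + mu * group_lasso(W, groups)

# Toy example: three features, two hidden nodes, feature 1 already pruned.
W = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
dep = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.1], [0.2, 0.1, 1.0]]
groups = [[0, 1], [2]]
print(total_loss(0.5, W, dep, groups, lam=3.0, mu=0.5))
```

In this sketch a feature whose input weights are all zero contributes nothing to either penalty, which is why driving a row norm to zero corresponds to dropping that feature. Setting `mu=0` leaves only the redundancy control, and setting `lam=0` leaves only the group-sparsity term.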

Paper Structure (19 sections, 24 equations, 6 figures, 8 tables, 1 algorithm)

This paper contains 19 sections, 24 equations, 6 figures, 8 tables, 1 algorithm.

Figures (6)

  • Figure 1: (a) Variation in the proportion of selected features with the increase in $\lambda$, the penalty factor for redundancy, for six data sets. (b) Variation in the average classification accuracy with the increase in $\lambda$ for six data sets.
  • Figure 2: Variation of the norm of the weight vectors connecting input nodes to hidden nodes for the IRIS data set with iterations for $\lambda=5$ and $\mu=0$.
  • Figure 3: Variation of average accuracy with the number of selected features.
  • Figure 4: Variation of the norm of the weight vectors connecting input nodes to hidden nodes for the LandSat data set with iterations for $\lambda=20$ and $\mu=1$.
  • Figure 5: (a) Variation of the proportions of selected features with the penalty parameter $\lambda$ for RNA-Seq 1 and RNA-Seq 2. (b) Variation of accuracy with the penalty parameter $\lambda$ for RNA-Seq 1 and RNA-Seq 2.
  • ...and 1 more figure