Reproduction of IVFS algorithm for high-dimensional topology preservation feature selection

Zihan Wang

TL;DR

This paper reproduces the IVFS algorithm introduced at AAAI 2020, which is inspired by the random subset method and preserves data similarity by maintaining topological structure. The reproduction shows that IVFS outperforms SPEC and MCFS on most datasets.

Abstract

Feature selection is a crucial technique for handling high-dimensional data. In unsupervised scenarios, many popular algorithms focus on preserving the original data structure. In this paper, we reproduce the IVFS algorithm introduced in AAAI 2020, which is inspired by the random subset method and preserves data similarity by maintaining topological structure. We systematically organize the mathematical foundations of IVFS and validate its effectiveness through numerical experiments similar to those in the original paper. The results demonstrate that IVFS outperforms SPEC and MCFS on most datasets, although issues with its convergence and stability persist.
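
To make the random-subset idea concrete, the following is a minimal sketch and not the paper's exact formulation. It assumes that each randomly drawn feature subset is scored by how much it distorts the pairwise distance matrix of the data (Frobenius norm of the difference, after scaling by the square root of the feature count), and that a feature's score is the average loss of the subsets it appeared in. The function names, the choice of loss, and the averaging rule are all illustrative assumptions.

```python
import numpy as np

def pairwise_dist(X):
    """Euclidean distance matrix between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off

def random_subset_scores(X, subset_size, n_rounds, seed=0):
    """Average distance-distortion loss per feature (assumed scoring rule)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Scaling by sqrt(#features) is an assumption to make the matrices comparable.
    full = pairwise_dist(X) / np.sqrt(d)
    total = np.zeros(d)
    count = np.zeros(d)
    for _ in range(n_rounds):
        idx = rng.choice(d, size=subset_size, replace=False)
        sub = pairwise_dist(X[:, idx]) / np.sqrt(subset_size)
        loss = np.linalg.norm(full - sub)  # Frobenius norm of the difference
        total[idx] += loss
        count[idx] += 1
    scores = np.full(d, np.inf)  # features never sampled rank last
    seen = count > 0
    scores[seen] = total[seen] / count[seen]
    return scores

def select_features(X, k, subset_size, n_rounds, seed=0):
    """Return indices of the k features with the lowest average loss."""
    scores = random_subset_scores(X, subset_size, n_rounds, seed)
    return np.argsort(scores)[:k]
```

A call such as `select_features(X, k=50, subset_size=200, n_rounds=1000)` would return 50 column indices of `X`; the parameter values here are placeholders, not settings taken from the paper.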

Paper Structure (16 sections, 11 equations, 13 figures, 2 tables, 1 algorithm)

Figures (13)

  • Figure 1: $L_2$ norm vs. the number of features on the Lymphoma dataset.
  • Figure 2: $L_2$ norm vs. the number of features on the Orlraws10P dataset.
  • Figure 3: $L_2$ norm vs. the number of features on the Pixraw10P dataset.
  • Figure 4: $L_2$ norm vs. the number of features on the Prostate-GE dataset.
  • Figure 5: $L_2$ norm vs. the number of features on the SMK-CAN-187 dataset.
  • ...and 8 more figures