K-means Derived Unsupervised Feature Selection using Improved ADMM

Ziheng Sun, Chris Ding, Jicong Fan

TL;DR

This paper develops an alternating direction method of multipliers (ADMM) algorithm to solve the NP-hard optimization problem of the K-means UFS model and shows that K-means UFS is more effective than the baselines at selecting features for clustering.

Abstract

Feature selection is important for high-dimensional data analysis and is non-trivial in unsupervised learning problems such as dimensionality reduction and clustering. The goal of unsupervised feature selection is to find a subset of features such that the data points from different clusters are well separated. This paper presents a novel method called K-means Derived Unsupervised Feature Selection (K-means UFS). Unlike most existing unsupervised feature selection methods, which are based on spectral analysis, we select features using the objective of K-means. We develop an alternating direction method of multipliers (ADMM) algorithm to solve the NP-hard optimization problem of our K-means UFS model. Extensive experiments on real datasets show that K-means UFS is more effective than the baselines at selecting features for clustering.
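To make the idea of scoring features by a K-means objective concrete, here is a minimal sketch. It is not the paper's model or its ADMM solver: it is a brute-force illustration that runs plain Lloyd iterations on each candidate feature and compares the within-cluster scatter to the total scatter. The function names `kmeans_cost` and `separation_score`, the normalization by total scatter, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def kmeans_cost(X, k, n_iter=50, seed=0):
    """Run plain Lloyd iterations and return the within-cluster sum of squares."""
    rng = np.random.default_rng(seed)
    # Assumption: initialize centers from k distinct random data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def separation_score(X, features, k):
    """Within-cluster scatter over total scatter on the chosen columns (lower is better)."""
    Xs = X[:, features]
    Xs = Xs - Xs.mean(axis=0)
    total = (Xs ** 2).sum()
    return kmeans_cost(Xs, k) / total

# Synthetic data (hypothetical): feature 0 separates two clusters, features 1-2 are noise.
rng = np.random.default_rng(0)
n = 100
informative = np.concatenate([rng.normal(-5, 0.5, n), rng.normal(5, 0.5, n)])
noise = rng.normal(0, 1, size=(2 * n, 2))
X = np.column_stack([informative, noise])

# Score each feature on its own and pick the one with the best separation.
scores = {j: separation_score(X, [j], k=2) for j in range(X.shape[1])}
best = min(scores, key=scores.get)
print(best, scores)
```

Scoring every subset this way grows combinatorially with the number of features, which is consistent with the abstract's remark that the underlying selection problem is NP-hard and motivates a dedicated solver such as the ADMM scheme the paper proposes.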

Paper Structure

This paper contains 18 sections, 37 equations, 4 figures, 4 tables, 1 algorithm.

Figures (4)

  • Figure 1: Visualization of $X_{h1}$ and $X_{h2}$ of a toy example
  • Figure 2: $\log_{10}(\Vert V\Vert_F)$ of the quadratic ADMM and the bi-linear ADMM formulations
  • Figure 3: ACC and NMI of K-means UFS with different initial $\mu^0$ and $\rho$ on MicroMass data
  • Figure 4: Convergence curve of K-means UFS on MicroMass and HARS data ($\mu^0 = 0.1, \rho = 1.05, h = 50$)