Spectral clustering algorithm for the allometric extension model

Kohei Kawamoto; Yuichi Goto; Koji Tsukuda

TL;DR

This paper addresses binary clustering under an allometric extension model in which the leading directions of the two covariances and the mean difference are aligned. It derives a non-asymptotic bound on the misclassification probability of a spectral clustering algorithm that uses the top eigenvector of the sample covariance and demonstrates high-dimensional consistency as n and m grow under suitable conditions. The analysis relies on sub-Gaussian data properties, concentration of the sample covariance around its expectation, and eigenvector perturbation bounds to relate the estimated and population eigenvectors, with the signal-to-noise ratio $\eta = \|\boldsymbol{\mu}\|^2 / \max_j \lambda_1(\boldsymbol{\Sigma}_j)$ governing performance. The results extend spectral clustering theory beyond homoscedastic assumptions and provide finite-sample guarantees for clustering under heteroscedastic allometric relations.

Abstract

The spectral clustering algorithm is often used as a binary clustering method for unclassified data by applying principal component analysis. To study the theoretical properties of the algorithm, existing studies often assume conditional homoscedasticity. However, this assumption is restrictive and often unrealistic in practice. Therefore, in this paper, we consider the allometric extension model, in which the directions of the first eigenvectors of the two covariance matrices and the direction of the difference of the two mean vectors coincide, and we provide a non-asymptotic bound on the error probability of the spectral clustering algorithm for this model. As a byproduct of the result, we obtain the consistency of the clustering method in high-dimensional settings.
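
As a rough illustration of the setting described above, the following sketch simulates a two-cluster Gaussian mixture whose mean difference and leading covariance eigenvectors share one direction, then clusters by the sign of the projection onto the top eigenvector of the sample covariance. Several choices here are assumptions made only for illustration and are not taken from the paper: the symmetric means (plus and minus a common vector), the Gaussian noise, the centering step, and the hypothetical names `sample_allometric_mixture`, `spectral_cluster`, and `misclassification_rate`.

```python
import numpy as np


def sample_allometric_mixture(n, p, mu_norm, lam=(1.0, 2.0), noise=(0.5, 0.8), seed=0):
    """Draw n points in R^p from a two-cluster Gaussian mixture (illustrative).

    Assumption: cluster means are +mu and -mu with mu = mu_norm * g, and each
    covariance has g as its first eigenvector (variance lam[j] along g,
    variance noise[j] < lam[j] in all other directions).
    """
    rng = np.random.default_rng(seed)
    g = np.zeros(p)
    g[0] = 1.0  # shared leading direction, taken as the first axis
    labels = rng.integers(0, 2, size=n)
    signs = np.where(labels == 0, 1.0, -1.0)
    X = np.empty((n, p))
    for j in (0, 1):
        idx = labels == j
        k = int(idx.sum())
        Z = rng.standard_normal((k, p)) * np.sqrt(noise[j])
        Z[:, 0] = rng.standard_normal(k) * np.sqrt(lam[j])  # larger variance along g
        X[idx] = Z
    X += signs[:, None] * mu_norm * g  # shift each point by its cluster mean
    return X, labels


def spectral_cluster(X):
    """Assign labels by the sign of the projection onto the top eigenvector."""
    Xc = X - X.mean(axis=0)  # centering is an assumption of this sketch
    S = Xc.T @ Xc / X.shape[0]  # sample covariance matrix
    _, V = np.linalg.eigh(S)  # eigenvalues are returned in ascending order
    v = V[:, -1]  # eigenvector of the largest eigenvalue
    return (Xc @ v < 0).astype(int)


def misclassification_rate(pred, truth):
    """Error rate up to label swapping, since the eigenvector sign is arbitrary."""
    err = float(np.mean(pred != truth))
    return min(err, 1.0 - err)


if __name__ == "__main__":
    X, labels = sample_allometric_mixture(n=2000, p=20, mu_norm=6.0, seed=1)
    print(misclassification_rate(spectral_cluster(X), labels))
```

In this toy setup the signal-to-noise ratio is `mu_norm**2 / max(lam)`, so lowering `mu_norm` or raising `lam` should make the clusters harder to separate.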

Paper Structure (14 sections, 7 theorems, 80 equations)

Introduction
Spectral clustering algorithm
Notation
Organization of the paper
Allometric extension model
Spectral clustering algorithm for the allometric extension model
Proofs
Proof of Proposition \ref{propae}
Proof of Proposition \ref{kex}
Proof of Proposition \ref{evs}
Proof of Theorem \ref{mthm}
Proof of Lemma \ref{lemk}
Proof of Corollary \ref{col}
Concluding remarks

Key Result

Proposition 1

Under the conditions stated above, where the sign of $\boldsymbol{\gamma}_1(\boldsymbol{\Sigma})$ is appropriately chosen in AE1.

Theorems & Definitions (15)

Proposition 1
Remark 1
Remark 2
Proposition 2
Proposition 3
Remark 3
Remark 4
Theorem 4
Remark 5
Corollary 5
...and 5 more
