On high-dimensional modifications of the nearest neighbor classifier

Annesha Ghosh; Deep Ghoshal; Bilol Banerjee; Anil K. Ghosh

TL;DR

This article reviews existing high-dimensional modifications of the nearest neighbor classifier and proposes new ones. It investigates them theoretically and compares the empirical performance of the proposed methods with some of the existing ones on several simulated and benchmark datasets.

Abstract

The nearest neighbor classifier is arguably the simplest and most popular nonparametric classifier available in the literature. However, due to the concentration of pairwise distances and the violation of the neighborhood structure, this classifier often performs poorly in high-dimension, low-sample-size (HDLSS) situations, especially when the scale difference between the competing classes dominates their location difference. Several attempts have been made in the literature to address this problem. In this article, we discuss some of these existing methods and propose some new ones. We carry out theoretical investigations of these methods and analyze several simulated and benchmark datasets to compare the empirical performance of the proposed methods with that of some of the existing ones.
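The failure mode described in the abstract can be reproduced with a small simulation. The sketch below is illustrative only and is not one of the paper's examples: the two classes, their scales (1 and 2), the dimension, the sample sizes, and the function names `one_nn_predict` and `simulate` are all assumptions chosen for the demonstration. Both classes are centered at the origin, so they differ only in scale. In high dimension the squared distances concentrate, and a point from the wider class tends to lie closer to every training point of the tighter class than to any training point of its own class.

```python
import numpy as np


def one_nn_predict(train_x, train_y, test_x):
    """Plain 1-nearest-neighbor prediction with Euclidean distance."""
    # Squared distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2.
    d2 = (
        (test_x ** 2).sum(axis=1)[:, None]
        - 2.0 * test_x @ train_x.T
        + (train_x ** 2).sum(axis=1)[None, :]
    )
    return train_y[np.argmin(d2, axis=1)]


def simulate(dim=1000, n_train=10, n_test=100, sigma=(1.0, 2.0), seed=0):
    """Return the 1-NN test error for each class separately.

    Hypothetical setup: class 0 ~ N(0, sigma[0]^2 I), class 1 ~ N(0, sigma[1]^2 I),
    i.e. no location difference, only a scale difference.
    """
    rng = np.random.default_rng(seed)
    train_x = np.vstack([s * rng.standard_normal((n_train, dim)) for s in sigma])
    train_y = np.repeat([0, 1], n_train)
    test_x = np.vstack([s * rng.standard_normal((n_test, dim)) for s in sigma])
    test_y = np.repeat([0, 1], n_test)

    pred = one_nn_predict(train_x, train_y, test_x)
    err_tight = float(np.mean(pred[test_y == 0] != 0))  # smaller-scale class
    err_wide = float(np.mean(pred[test_y == 1] != 1))   # larger-scale class
    return err_tight, err_wide


if __name__ == "__main__":
    err_tight, err_wide = simulate()
    print(f"error on tighter class: {err_tight:.2f}")
    print(f"error on wider class:   {err_wide:.2f}")
```

Under these assumptions, the squared distance divided by the dimension is close to 2 within the tighter class, 8 within the wider class, and 5 across classes, so nearly every test point is assigned to the tighter class regardless of its true label.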

Paper Structure (7 sections, 3 theorems, 12 equations, 12 figures, 1 table)

Introduction
Modified Scale-adjusted Nearest Neighbor Classifier
Nearest neighbor classification using distance-based features
Classification based on minimum distances
Classification based on multiple neighbors
Results from the analysis of benchmark datasets
Concluding remarks

Key Result

Theorem 1

If $J$ competing classes satisfy assumptions (A1)-(A3), and there are at least two observations from each of them (i.e., $n_j\ge 2$ for all $j=1,2,\ldots, J$), then we have the following results.

Figures (12)

Figure 1: Misclassification rates of Bayes, NN, CH and MCH classifiers in Examples 1-3.
Figure 2: Misclassification rates of Bayes, NN, CH, and MCH classifiers in Examples 4-6.
Figure 3: Scatter plots of training (top row) and test (bottom row) samples along with the class boundaries estimated by NN, CH, MCH, and MDist classifiers in Example 4.
Figure 4: Scatter plots of the test samples and the class boundaries estimated by NN, CH, MCH and MDist classifiers in Example 5.
Figure 5: Scatter plots of the test samples and the class boundaries estimated by NN, CH, MCH and MDist classifiers in Example 6.
...and 7 more figures

Theorems & Definitions (3)

Theorem 1
Theorem 2
Theorem 3