Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator

Alexandre Luís Magalhães Levada, Frank Nielsen, Michel Ferreira Cardia Haddad

TL;DR

This work tackles the sensitivity of the $k$-NN classifier to a fixed neighborhood size by introducing $kK$-NN, which adapts the neighborhood of each sample using a curvature-based estimate of the shape operator. The method computes the Gaussian curvature of local patches from the inverse of a local covariance matrix and a Hessian-inspired construction, forming a per-sample shape operator that guides edge pruning in the $k$-NN graph. Empirical results on 30 real datasets show that $kK$-NN delivers higher balanced accuracy than standard $k$-NN and a competing adaptive approach, particularly when training data are scarce, though at higher computational cost. The study suggests promising directions for integrating curvature-aware geometry into classification and related tasks such as metric learning and dimensionality reduction.

Abstract

The $k$-nearest neighbor ($k$-NN) algorithm is one of the most popular methods for nonparametric classification. However, a relevant limitation concerns the choice of the number of neighbors $k$. This parameter directly affects several properties of the classifier, such as the bias-variance tradeoff, smoothness of decision boundaries, robustness to noise, and class imbalance handling. In the present paper, we introduce a new adaptive $k$-nearest neighbor ($kK$-NN) algorithm that uses the local curvature at a sample to adaptively define the neighborhood size. The rationale is that points with low curvature could have larger neighborhoods (locally, the tangent space approximates the underlying data shape well), whereas points with high curvature could have smaller neighborhoods (locally, the tangent space is a loose approximation). We estimate the local Gaussian curvature by computing an approximation to the local shape operator in terms of the local covariance matrix and the local Hessian matrix. Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method and to another adaptive $k$-NN algorithm. This is particularly evident when the number of samples in the training data is limited, suggesting that $kK$-NN is capable of learning more discriminant functions with less data in many relevant cases.
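
The rationale above (low curvature, larger neighborhood; high curvature, smaller neighborhood) can be illustrated with a minimal sketch. This is not the paper's shape-operator estimator: it swaps in a simpler curvature proxy, the "surface variation" of a 2D patch (smallest covariance eigenvalue divided by the trace), and a linear mapping from that proxy to a neighborhood size. The function names `surface_variation` and `adaptive_k`, the parameters `k_max` and `k_min`, and the linear mapping are all assumptions made for illustration.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i] (Euclidean, excluding i)."""
    def sqdist(j):
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
    others = sorted((j for j in range(len(points)) if j != i), key=sqdist)
    return others[:k]

def surface_variation(patch):
    """Curvature proxy for a 2D patch (an assumption, not the paper's estimator):
    smallest eigenvalue of the patch covariance divided by its trace."""
    n = len(patch)
    mx = sum(p[0] for p in patch) / n
    my = sum(p[1] for p in patch) / n
    sxx = sum((p[0] - mx) ** 2 for p in patch) / n
    syy = sum((p[1] - my) ** 2 for p in patch) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in patch) / n
    trace = sxx + syy
    if trace == 0.0:
        return 0.0
    # Closed-form smallest eigenvalue of a symmetric 2x2 matrix.
    disc = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam_min = max(trace / 2 - disc, 0.0)
    return lam_min / trace

def adaptive_k(points, k_max, k_min=1):
    """Per-sample neighborhood size: flat regions keep k_max neighbors,
    the most curved region is pruned down to k_min (hypothetical linear rule)."""
    curv = []
    for i in range(len(points)):
        patch = [points[i]] + [points[j] for j in knn_indices(points, i, k_max)]
        curv.append(surface_variation(patch))
    lo, hi = min(curv), max(curv)
    sizes = []
    for c in curv:
        t = 0.0 if hi == lo else (c - lo) / (hi - lo)
        sizes.append(round(k_max - t * (k_max - k_min)))
    return sizes

# An L-shaped point set: straight arms with a sharp corner at (5, 0).
points = [(x, 0) for x in range(6)] + [(5, y) for y in range(1, 6)]
sizes = adaptive_k(points, k_max=4, k_min=1)
```

In this toy example the corner sample receives the smallest neighborhood and the samples in the middle or at the ends of the straight arms keep the largest one, which mirrors the intuition in the abstract without reproducing the paper's Gaussian curvature estimate.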

Paper Structure (9 sections, 20 equations, 4 figures, 3 tables, 5 algorithms)

Figures (4)

  • Figure 1: The $k$-NNG of a dataset of handwritten digits with $n = 1797$ samples (graph nodes) and $k = \log_2 n$ neighbors.
  • Figure 2: The balanced accuracy curves built by the holdout strategy for the regular $k$-NN, the adaptive $k$-NN [lejeune2019adaptive], and the proposed $kK$-NN classifiers using different partition sizes, from 10% to 90% of the total number of samples. Top-left: vowel dataset. Top-right: Olivetti_Faces dataset. Bottom-left: ionosphere dataset. Bottom-right: parkinsons dataset.
  • Figure 3: The balanced accuracy curves built by the holdout strategy for the regular $k$-NN, the adaptive $k$-NN [lejeune2019adaptive], and the proposed $kK$-NN classifiers using different partition sizes, from 10% to 90% of the total number of samples. Top-left: UMIST_Faces_Cropped dataset. Top-right: variousCancers_final dataset. Bottom-left: micro-mass dataset. Bottom-right: GCM dataset.
  • Figure 4: The $k$-NN decision boundary and the Voronoi tessellation in a 2D feature space [Voronoi].
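
The $k$-NN graph of Figure 1 uses $k = \log_2 n$ neighbors per node. A minimal sketch of such a graph as adjacency lists is given below, assuming Euclidean distance and rounding $\log_2 n$ to the nearest integer (the caption does not say how the value is made an integer). The helper names `default_k` and `knn_graph` are hypothetical.

```python
import math

def default_k(n):
    """k = log2(n), rounded to the nearest integer (rounding is an assumption)."""
    return max(1, round(math.log2(n)))

def knn_graph(points, k=None):
    """Directed k-NN graph: maps each node index to its k nearest neighbors."""
    n = len(points)
    if k is None:
        k = default_k(n)
    k = min(k, n - 1)  # a node cannot have more neighbors than other nodes
    graph = {}
    for i, p in enumerate(points):
        def sqdist(j):
            return sum((a - b) ** 2 for a, b in zip(p, points[j]))
        graph[i] = sorted((j for j in range(n) if j != i), key=sqdist)[:k]
    return graph

# Four 1D samples: n = 4 gives k = 2 neighbors per node.
graph = knn_graph([(0.0,), (1.0,), (2.0,), (10.0,)])
```

Under this rounding assumption, the $n = 1797$ samples of Figure 1 would give $k = 11$.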

Theorems & Definitions (5)

  • Definition 1: Manifold
  • Definition 2: Tangent space
  • Definition 3: First fundamental form or metric tensor
  • Definition 4: Second fundamental form
  • Definition 5: Shape operator and curvatures
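
For reference, the standard textbook relations between these objects for a two-dimensional surface in $\mathbb{R}^3$ are sketched below, with $\mathrm{I}$ and $\mathrm{II}$ denoting the matrices of the first and second fundamental forms, $S$ the shape operator, $K$ the Gaussian curvature, and $H$ the mean curvature. Sign and normalization conventions vary between texts, so the paper's own Definition 5 may differ in those details.

$$
S = \mathrm{I}^{-1}\,\mathrm{II}, \qquad K = \det(S) = \frac{\det(\mathrm{II})}{\det(\mathrm{I})}, \qquad H = \tfrac{1}{2}\operatorname{tr}(S)
$$

This matches the description in the TL;DR, where the inverse of a local covariance matrix and a Hessian-inspired matrix are combined into a per-sample shape operator whose determinant gives the Gaussian curvature.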