Flexible K Nearest Neighbors Classifier: Derivation and Application for Ion-mobility Spectrometry-based Indoor Localization

Philipp Müller

TL;DR

A new KNN variant is proposed that ensures the K nearest neighbors are indeed close to the unlabelled sample and determines K along the way. In the tests it achieves a higher classification accuracy than the standard KNN while having the same computational demand.

Abstract

The K Nearest Neighbors (KNN) classifier is widely used in many fields, such as fingerprint-based localization and medicine. It determines the class membership of an unlabelled sample based on the class memberships of the K labelled samples, the so-called nearest neighbors, that are closest to the unlabelled sample. The choice of K has been the topic of various studies and proposed KNN variants. Yet no variant has been proven to outperform all other variants. In this paper a KNN variant is discussed which ensures that the K nearest neighbors are indeed close to the unlabelled sample and finds K along the way. The algorithm is tested and compared to the standard KNN in theoretical scenarios and for indoor localization based on ion-mobility spectrometry fingerprints. It achieves a higher classification accuracy than the KNN in the tests, while having the same computational demand.
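
The idea of letting a distance threshold determine K can be illustrated with a minimal Python sketch. It is not the paper's algorithm listing: the function name `flexible_knn`, the Euclidean distance, the inclusive threshold test, and the majority vote over the neighbors are all assumptions made for illustration. The threshold `d_max` corresponds to the maximum allowed distance $d_\text{max}$ described in the caption of Figure 1.

```python
import math
from collections import Counter


def flexible_knn(train_points, train_labels, query, d_max):
    """Return (label, k); label is None if no training sample lies within d_max.

    Assumptions (not taken from the paper): Euclidean distance, an inclusive
    threshold (distance <= d_max), and a majority vote over the neighbors.
    """
    # Collect the labels of all training samples within d_max of the query.
    neighbor_labels = [
        label
        for point, label in zip(train_points, train_labels)
        if math.dist(point, query) <= d_max
    ]
    k = len(neighbor_labels)  # K is found along the way, not fixed in advance
    if k == 0:
        return None, 0  # no label can be provided
    # Majority vote; ties go to the label encountered first.
    label = Counter(neighbor_labels).most_common(1)[0][0]
    return label, k


# Hypothetical two-dimensional toy data, loosely modeled on Figure 1.
train_points = [(0.2, 0.1), (0.5, -0.3), (-0.4, 0.6), (3.0, 3.0), (3.5, 2.5)]
train_labels = [1, 1, 2, 2, 2]

near = flexible_knn(train_points, train_labels, (0.0, 0.0), d_max=1.0)
far = flexible_knn(train_points, train_labels, (10.0, 10.0), d_max=1.0)
print(near)  # (1, 3): three samples lie within d_max, two of them from class 1
print(far)   # (None, 0): no training sample lies within d_max
```

In this sketch a single pass over the training set yields both K and the label, and no sorting of distances is performed. A standard KNN with a fixed K would always return a label for the second query, even though every training sample is far away from it.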

Paper Structure (9 sections, 1 equation, 3 figures, 1 algorithm)

Figures (3)

  • Figure 1: Illustration of the working principle of the Flexible $K$NN for two-dimensional data. Training samples from two classes are visualized by red crosses (class 1) and blue asterisks (class 2) in all three subfigures. The maximum allowed distance $d_\text{max}$ between the test sample (black diamond) and a training sample is set to one and is visualized by the black circle around the test sample. The Flexible $K$NN classifies the test sample as belonging to class 1 in (a) and to class 2 in (b). For the $K$NN, the resulting label depends on the choice of $K$. In subfigure (c), samples from a third class are visualized by grey circles. These samples are unknown to the classifier. The test sample clearly belongs to this unknown class. The Flexible $K$NN reports that no label can be provided because no training sample lies within $d_\text{max}$. In contrast, the $K$NN would wrongly classify the test sample as belonging to either class 1 or class 2, depending on the choice of $K$.
  • Figure 2: Classification accuracy of Flex$K$NN for varying $d_\text{max}$. The blue, solid line shows the ratio of accurately classified test samples among those for which Flex$K$NN returned a label. The red, dashed line shows the ratio of test samples that were either correctly classified or for which Flex$K$NN did not yield a label because no training sample was within $d_\text{max}$. For comparison, the classification accuracy of the $K$NN with $K=3$ is shown (black, dash-dotted line).
  • Figure 3: Number of training samples $K$ inside $d_\text{max}$ for Flex$K$NN. The label of a test sample is derived from the labels of these $K$ training samples.
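
The two accuracy curves described for Figure 2 can be computed from a list of predictions in which a missing label is encoded as `None`. The following Python sketch is an illustration only: the function name `accuracy_metrics`, the `None` encoding, and the choice of denominators (labelled samples for the first ratio, all test samples for the second) are assumptions, and the example data are made up rather than taken from the paper.

```python
def accuracy_metrics(predictions, truths):
    """Return (accuracy among labelled samples, ratio correct-or-unlabelled).

    A prediction of None means the classifier returned no label.
    The denominators are assumptions about how the two curves are defined.
    """
    if not predictions:
        raise ValueError("at least one test sample is required")
    # Keep only the test samples for which a label was returned.
    labelled = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    correct = sum(p == t for p, t in labelled)
    unlabelled = len(predictions) - len(labelled)
    # First ratio: correct labels relative to the samples that received a label.
    acc_labelled = correct / len(labelled) if labelled else float("nan")
    # Second ratio: correct or unlabelled samples relative to all test samples.
    acc_or_unlabelled = (correct + unlabelled) / len(predictions)
    return acc_labelled, acc_or_unlabelled


# Made-up predictions for five test samples; two of them received no label.
predictions = [1, 2, None, 1, None]
truths = [1, 1, 3, 1, 2]
acc_labelled, acc_or_unlabelled = accuracy_metrics(predictions, truths)
print(acc_labelled)       # 2 of the 3 labelled samples are correct
print(acc_or_unlabelled)  # 4 of the 5 samples are correct or unlabelled
```

Under these definitions the second ratio can never be lower than the fraction of correctly labelled test samples, since every unlabelled sample counts in its favor.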