DW-KNN: A Transparent Local Classifier Integrating Distance Consistency and Neighbor Reliability

Kumarjit Pathak, Karthik K, Sachin Madan, Jitin Kapila

TL;DR

DW-KNN tackles reliability and interpretability gaps in KNN by introducing a dual weighting scheme that combines class-wise distance pooling with neighbor validity. The method yields robust accuracy and stability across nine datasets, with statistically significant improvements over select baselines and strong performance on challenging real-world tasks. It preserves interpretability through instance-level validity signals and maintains competitive parity with ensemble approaches, while highlighting trade-offs in minority-class recall and computational cost. Overall, DW-KNN provides a practical, explainable alternative to complex metric-learning methods for reliability-aware nearest-neighbor classification.

Abstract

K-Nearest Neighbors (KNN) is one of the most widely used machine learning classifiers. However, standard distance-weighted KNN and related variants assume that all $k$ neighbors are equally reliable. In heterogeneous feature spaces, this assumption limits how reliably the true class of an observation can be predicted. We propose DW-KNN (Double Weighted KNN), a transparent and robust variant that integrates exponential distance weighting with neighbor validity. This enables instance-level interpretability, suppresses noisy or mislabeled samples, and reduces hyperparameter sensitivity. A comprehensive evaluation on nine datasets shows that DW-KNN achieves an average accuracy of 0.8988. It ranks second among six methods, within 0.2% of the best-performing Ensemble KNN. It also exhibits the lowest cross-validation variance (0.0156), indicating stable predictions. Statistical significance tests confirm ($p < 0.001$) improvements over compactness-weighted KNN (+4.09%) and kernel-weighted KNN (+1.13%). The method provides a simple yet effective alternative to complex adaptive schemes and is particularly valuable for high-stakes applications requiring explainable predictions.
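
The abstract names the two weighting ingredients but does not give their formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a training point's validity is the fraction of its $K_v$ nearest training neighbors that share its label, that the distance weight is $\exp(-\gamma d)$, and that the two are multiplied and summed per class. The function names `neighbor_validity` and `dw_knn_predict` and all default values are hypothetical.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def neighbor_validity(X, y, k_v=5):
    """ASSUMED definition: share of a point's k_v nearest training
    neighbors (excluding itself) that carry the same label."""
    validity = []
    for i, xi in enumerate(X):
        others = sorted((euclidean(xi, xj), j) for j, xj in enumerate(X) if j != i)
        nearest = [j for _, j in others[:k_v]]
        agree = sum(1 for j in nearest if y[j] == y[i])
        validity.append(agree / len(nearest))
    return validity


def dw_knn_predict(X_train, y_train, x_query, k=5, k_v=5, gamma=1.0):
    """Score each class by summing exp(-gamma * d) * validity over the
    k nearest neighbors (ASSUMED combination rule), then pick the best."""
    validity = neighbor_validity(X_train, y_train, k_v)
    nearest = sorted((euclidean(x_query, xj), j) for j, xj in enumerate(X_train))[:k]
    scores = defaultdict(float)
    for d, j in nearest:
        scores[y_train[j]] += math.exp(-gamma * d) * validity[j]
    # Return the per-class scores too, so a prediction can be inspected.
    return max(scores, key=scores.get), dict(scores)
```

Under these assumptions, a mislabeled point surrounded by the other class receives a validity of zero and contributes nothing to the vote, even when it is the closest neighbor to the query. Returning the per-class scores alongside the label is one way to expose the instance-level signal that the abstract describes.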

Paper Structure

This paper contains 53 sections, 5 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Sensitivity analysis for validity neighborhood size ($K_v$). Performance remains stable across $K_v \in [5, 20]$ on most datasets.
  • Figure 2: Sensitivity analysis for exponential decay parameter ($\gamma$). Performance varies by less than 2% across two orders of magnitude of $\gamma$.
  • Figure 3: Decision boundaries on linearly separable data showing clean separation without overfitting.
  • Figure 4: Decision boundaries on Imbalanced Blobs dataset (1:3 ratio, linearly separable). All methods achieve perfect accuracy with identical linear separators.
  • Figure 5: Decision boundaries on overlapping classes. DW-KNN produces smooth boundaries without fragmentation.
  • ...and 1 more figure