Class maps for visualizing classification results

Jakob Raymaekers, Peter J. Rousseeuw, Mia Hubert

TL;DR

The class map visualizes aspects of classification results to give insight into the data. The display is constructed for discriminant analysis, the k-nearest neighbor classifier, support vector machines, logistic regression, and coupling pairwise classifications.

Abstract

Classification is a major tool of statistics and machine learning. A classification method first processes a training set of objects with given classes (labels), with the goal of afterward assigning new objects to one of these classes. When running the resulting prediction method on the training data or on test data, it can happen that an object is predicted to lie in a class that differs from its given label. This is sometimes called label bias, and raises the question of whether the object was mislabeled. The proposed class map reflects the probability that an object belongs to an alternative class, how far it is from the other objects in its given class, and whether some objects lie far from all classes. The goal is to visualize aspects of the classification results to obtain insight into the data. The display is constructed for discriminant analysis, the k-nearest neighbor classifier, support vector machines, logistic regression, and coupling pairwise classifications. It is illustrated on several benchmark datasets, including some about images and texts.
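To make "the probability that an object belongs to an alternative class" concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: it assumes the classifier returns posterior probabilities for every class, takes the alternative class to be the non-given class with the highest posterior, and defines the quantity as that posterior's share of the two. The function name `prob_alternative_class` and the toy data are hypothetical.

```python
import numpy as np

def prob_alternative_class(posteriors, given):
    """Sketch of a 'probability of the alternative class' per object.

    posteriors: (n, G) array of class probabilities from any classifier.
    given:      (n,) integer array with each object's given label.
    Assumption (not stated in the abstract): the alternative class is the
    non-given class with the highest posterior, and the returned value is
    p_alt / (p_given + p_alt), which requires p_given + p_alt > 0.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    given = np.asarray(given)
    rows = np.arange(posteriors.shape[0])

    p_given = posteriors[rows, given]

    # Mask out the given class so argmax picks the best competing class.
    masked = posteriors.copy()
    masked[rows, given] = -np.inf
    alt = masked.argmax(axis=1)
    p_alt = posteriors[rows, alt]

    return p_alt / (p_given + p_alt), alt

# Hypothetical posteriors for three objects and three classes.
posteriors = np.array([[0.80, 0.15, 0.05],
                       [0.20, 0.70, 0.10],
                       [0.30, 0.30, 0.40]])
given = np.array([0, 0, 2])

pac, alt = prob_alternative_class(posteriors, given)
for i, (value, a) in enumerate(zip(pac, alt)):
    print(f"object {i}: given={given[i]}, alternative={a}, value={value:.3f}")
```

Under this definition a value above 0.5 means the best alternative class is more probable than the given one, so the second object (given class 0, posterior 0.70 for class 1) is the kind of case that raises the mislabeling question.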

Paper Structure

This paper contains 17 sections, 20 equations, 30 figures, and 5 tables.

Figures (30)

  • Figure 1: A stacked mosaic plot of a classification of the floral bud data. The given classes (labels) are on the horizontal axis, and the predicted classes are on the vertical axis. The area of each rectangle is proportional to the number of objects in it.
  • Figure 2: Example of a class map.
  • Figure 3: Floral buds data: maps of all four classes.
  • Figure 4: Top row: randomly sampled digits; bottom row: averaged images per class.
  • Figure 5: MNIST data: stacked mosaic plot where the objects flagged as outliers are shown in dark grey, as an extra class at the top.
  • ...and 25 more figures