Discriminative Subspace Emersion from learning feature relevances across different populations

Marco Canducci, Lida Abdi, Alessandro Prete, Roland J. Veen, Michael Biehl, Wiebke Arlt, Peter Tino

TL;DR

Theoretical and empirical investigations on synthetic and real-world datasets indicate that DSE accurately identifies a common subspace for classification across different populations, even when the classes overlap to a surprisingly high degree.

Abstract

In a given classification task, the accuracy of the learner is often hampered by the finite size of the training set, the high dimensionality of the feature space, and severe overlap between classes. For interpretable learners with (piecewise) linear separation boundaries, these issues can be mitigated by careful construction of optimization procedures and/or by estimating the features relevant to the task. However, when the task is shared across two disjoint populations, the main interest shifts towards estimating the set of features that discriminates most between the two populations when performing classification. We propose a new Discriminative Subspace Emersion (DSE) method that extends subspace learning toward a general relevance learning framework. DSE allows us to identify the most relevant features in distinguishing the classification task across two populations, even in cases of high overlap between classes. The proposed methodology is designed to work with multiple sets of labels and is derived in principle without being tied to a specific choice of base learner. Theoretical and empirical investigations on synthetic and real-world datasets indicate that DSE accurately identifies a common subspace for classification across different populations. This holds even for a surprisingly high degree of overlap between classes.

Paper Structure

This paper contains 19 sections, 31 equations, 5 figures.

Figures (5)

  • Figure 1: (a): Two-dimensional representation of the two considered populations ("A" and "B") with two classes ("Disease" and "No Disease"). (b): Phase 1 - Case 1 (top) and Case 2 (center), with uncertainty in the estimation of a discriminative hyperplane as a gray band. The bottom panel shows Phase 2 classification of the relevances estimated in Case 1 (diamonds) and Case 2 (asterisks), relative to 100 classifiers trained in each Case.
  • Figure 2: AUC of classification tasks for Phase 1 - Case 1 (cyan), Case 2 (orange) and Phase 2 (yellow) at varying separation $t$, with GMLVQ and SVM as base learners (one panel each). The vertical bars show the mean and standard deviation of the results over 100 trials in each Phase and Case.
  • Figure 3: Feature relevance vectors with GMLVQ (a) and SVM (b) as base learners, and two-dimensional embedding of samples obtained with GMLVQ (c), for different values of $t$ on the synthetic data for DSE. In each plot, top row: $t = 0.01$; bottom row: $t = 0.25$. Column 1: Phase 1 - Case 1; Column 2: Phase 1 - Case 2; Column 3: Phase 2.
  • Figure 4: Pessimistic (yellow), Optimistic (magenta) and Experimental (purple) separations (in log scale) for two different dimensions $d$: $d = 5$ (left panel) and $d = 20$ (right panel).
  • Figure 5: Results of the experiment on adrenal tumour data for the two considered populations, population A and population B, for the given health condition. (Top row): Two-dimensional embeddings in Phase 1 - Case 1 / Case 2 and Phase 2 (left to right); (Middle row): ROC of Phase 1 - Case 1 / Case 2 and Phase 2 (left to right); (Bottom panel): Sorted feature relevance vectors in Phase 2.
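
The two-phase procedure described in the Figure 1 caption (many classifiers trained per Case in Phase 1, followed by a Phase 2 classification of their relevance vectors) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper. The captions here do not define Case 1 and Case 2, so the sketch assumes, purely for illustration, that they correspond to training on population A and on population B. A nearest-class-mean linear classifier stands in for the base learner (the figures use GMLVQ and SVM). Relevances are taken to be normalised squared weights. The synthetic data and all function names are hypothetical.

```python
import numpy as np

def mean_difference_weights(X, y):
    """Nearest-class-mean linear classifier: the weight vector is the
    difference of the two class means (stand-in for GMLVQ / SVM)."""
    return X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)

def relevance(w):
    """Assumed relevance definition: squared weights normalised to sum to 1."""
    r = w ** 2
    return r / r.sum()

def sample_population(rng, n_per_class, d, informative_axis, shift):
    """Hypothetical population: two Gaussian classes that differ only by a
    mean shift along one feature."""
    X0 = rng.normal(size=(n_per_class, d))
    X1 = rng.normal(size=(n_per_class, d))
    X1[:, informative_axis] += shift
    X = np.vstack([X0, X1])
    y = np.concatenate([np.zeros(n_per_class, dtype=int),
                        np.ones(n_per_class, dtype=int)])
    return X, y

def phase1_relevances(rng, X, y, n_models=100, frac=0.8):
    """Phase 1: train n_models classifiers on random subsamples and collect
    one relevance vector per classifier."""
    n = len(y)
    m = int(frac * n)
    rows = []
    for _ in range(n_models):
        idx = rng.choice(n, size=m, replace=False)
        rows.append(relevance(mean_difference_weights(X[idx], y[idx])))
    return np.array(rows)

def run_sketch(seed=0, d=5, n_per_class=200, shift=2.0):
    rng = np.random.default_rng(seed)
    # Assumption: the task depends on feature 0 in population A
    # and on feature 1 in population B.
    XA, yA = sample_population(rng, n_per_class, d, informative_axis=0, shift=shift)
    XB, yB = sample_population(rng, n_per_class, d, informative_axis=1, shift=shift)
    R1 = phase1_relevances(rng, XA, yA)  # assumed Case 1
    R2 = phase1_relevances(rng, XB, yB)  # assumed Case 2
    # Phase 2: classify relevance vectors by Case and read off which
    # features separate the two sets of relevances.
    R = np.vstack([R1, R2])
    case = np.concatenate([np.zeros(len(R1), dtype=int),
                           np.ones(len(R2), dtype=int)])
    rel2 = relevance(mean_difference_weights(R, case))
    return R1, R2, rel2

if __name__ == "__main__":
    R1, R2, rel2 = run_sketch()
    print("Phase 2 relevances:", np.round(rel2, 3))
```

In this toy setting the Phase 2 relevances concentrate on the two features along which the populations differ in how the task is solved. Swapping in another linear base learner only requires replacing `mean_difference_weights`.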