Active Learning Classification from a Signal Separation Perspective

Hrushikesh Mhaskar, Ryan O'Dowd, Efstratios Tsoukanis

TL;DR

This work treats classification as a signal-separation task and develops a geometry-aware, kernel-based framework that isolates class supports under overlap. It leverages a localized kernel on the sphere, an estimator F_{n,M}, and a partitioning theorem to identify dense regions corresponding to classes, then employs an iterative active-learning loop with graph-based clustering and label propagation. The SCALe method achieves high accuracy with very few labeled examples on hyperspectral benchmarks, including 96.04% on Salinas with 3% labels and 81.46% on Indian Pines with 7.5% labels, indicating strong data-efficiency. The approach offers a principled, generalizable pathway for labeling-efficient classification and motivates cross-domain applications.
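The pipeline summarized above (kernel estimate, thresholding to find dense regions, graph-based grouping, then label queries with propagation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the SCALe implementation: the kernel `((1 + x·y)/2)^n` is a simple stand-in for the paper's localized kernel, `density_estimate` is only a simplified analogue of $F_{n,M}$, and the function names, the relative threshold `theta`, and the linking `radius` are hypothetical.

```python
import math
import random

def normalize(v):
    """Project a vector onto the unit sphere."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def geodesic(x, y):
    """Great-circle distance between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, y))))
    return math.acos(dot)

def kernel(x, y, n):
    """Stand-in localized kernel (an assumption, not the paper's kernel):
    close to 1 when x is near y, decaying faster as n grows."""
    dot = sum(a * b for a, b in zip(x, y))
    return ((1.0 + dot) / 2.0) ** n

def density_estimate(x, samples, n):
    """Kernel average over the samples, a simplified analogue of F_{n,M}(x)."""
    return sum(kernel(x, s, n) for s in samples) / len(samples)

def cluster_dense_points(samples, n, theta, radius):
    """Keep samples whose estimate is at least theta * (max estimate), then
    return the connected components of the graph that links kept samples
    lying within geodesic distance `radius` of each other."""
    scores = [density_estimate(x, samples, n) for x in samples]
    cutoff = theta * max(scores)
    unvisited = {i for i, s in enumerate(scores) if s >= cutoff}
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if geodesic(samples[i], samples[j]) <= radius]
            unvisited.difference_update(near)
            component.extend(near)
            frontier.extend(near)
        clusters.append(sorted(component))
    return clusters

def query_and_propagate(clusters, oracle):
    """Query the oracle once per cluster and copy that label to every member."""
    labels = {}
    for members in clusters:
        label = oracle(members[0])
        for i in members:
            labels[i] = label
    return labels

# Toy data: two tight groups on the 2-sphere plus one isolated point.
rng = random.Random(0)

def noisy_copies(center, count, sigma=0.05):
    return [normalize([c + rng.gauss(0.0, sigma) for c in center])
            for _ in range(count)]

samples = (noisy_copies((0, 0, 1), 30)      # indices 0..29, class "A"
           + noisy_copies((1, 0, 0), 30)    # indices 30..59, class "B"
           + [(0.0, -1.0, 0.0)])            # index 60, isolated point

def oracle(i):
    """Hypothetical labeling oracle standing in for a human annotator."""
    return "A" if i < 30 else "B"

clusters = cluster_dense_points(samples, n=50, theta=0.3, radius=0.4)
labels = query_and_propagate(clusters, oracle)
print(len(clusters), "clusters,", len(labels), "points labeled")
```

On this toy data the two groups should come out as separate clusters, each labeled from a single query, while the isolated point falls below the threshold and stays unlabeled. The actual method is iterative and treats overlapping class supports, which this single pass does not attempt.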

Abstract

In machine learning, classification is usually seen as a function approximation problem, where the goal is to learn a function that maps input features to class labels. In this paper, we propose a novel clustering and classification framework inspired by the principles of signal separation. This approach enables efficient identification of class supports, even in the presence of overlapping distributions. We validate our method on the real-world hyperspectral datasets Salinas and Indian Pines. The experimental results demonstrate that our method is competitive with state-of-the-art active learning algorithms while using only a very small subset of the data set as training points.

Paper Structure

This paper contains 5 sections, 1 theorem, 16 equations, 2 figures, 1 algorithm.

Key Result

Theorem 1

Let $\mu^*$ be a probability measure with a fine structure given by parameter $\eta$, and let $S\ge q+2$ be an integer. Let $M \ge c_3n^\alpha \log(n)$ and let $\{x_1, x_2, \ldots, x_M\}$ be independent samples from $\mu^*$. There exists $r(\Theta) \sim \Theta^{-1/(S-\alpha)}$ such that with probability at least … Moreover, if $n>2r(\Theta)/\eta$, there exists a partition $\{\mathcal{G}_{k,\eta,n}(\Theta)\}_{k=1}$ …
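To get a feel for the two scalings visible in the statement, the sample-size condition $M \ge c_3 n^\alpha \log(n)$ and the rate $r(\Theta) \sim \Theta^{-1/(S-\alpha)}$, here is a small arithmetic sketch. The constant `c3` and the values of `alpha` and `S` are placeholders chosen for illustration, not values from the paper, and `radius_scale` returns the rate only up to the unspecified constants hidden by $\sim$.

```python
import math

def min_samples(n, alpha, c3=1.0):
    """Smallest integer M with M >= c3 * n**alpha * log(n).
    c3 is an unknown constant in the theorem; 1.0 is a placeholder."""
    return math.ceil(c3 * n ** alpha * math.log(n))

def radius_scale(theta, S, alpha):
    """Theta ** (-1 / (S - alpha)): the rate of r(Theta), up to constants."""
    return theta ** (-1.0 / (S - alpha))

# Placeholder parameters: alpha = 1, S = 5.
for n in (10, 100, 1000):
    print(n, min_samples(n, alpha=1.0))

for theta in (1.0, 16.0, 256.0):
    print(theta, radius_scale(theta, S=5, alpha=1.0))
```

With these placeholder values the required sample count grows slightly faster than linearly in $n$, and the rate for $r(\Theta)$ shrinks as the threshold $\Theta$ increases.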

Figures (2)

  • Figure 1: Salinas dataset.
  • Figure 2: Indian Pines dataset.

Theorems & Definitions (1)

  • Theorem 1