Active Learning via Classifier Impact and Greedy Selection for Interactive Image Retrieval

Leah Bar, Boaz Lerner, Nir Darshan, Rami Ben-Ari

TL;DR

The paper tackles interactive image retrieval under challenging open-set and imbalanced conditions with very few labeled examples. It introduces GAL, a Greedy Active Learning framework that combines a sample-wise impact value for linear and non-linear classifiers with a global GP-based impact measure, then builds batches greedily to balance uncertainty and diversity. The authors prove a $(1-\frac{1}{e})$-approximation guarantee for the GP-based greedy strategy and demonstrate strong empirical gains across SVM, MLP, and GP on Paris-6K, Places, FSOD-IR, and MIRFLICKR-25K, with practical runtimes and a public code release. The approach advances interactive CBIR by enabling efficient cold-start learning and robust performance under open-set and imbalanced conditions, potentially benefiting other open-set AL applications. The combination of theoretical guarantees and broad empirical validation highlights GAL as a versatile toolkit for BMAL in challenging retrieval tasks.

Abstract

Active Learning (AL) is a user-interactive approach aimed at reducing annotation costs by selecting the most crucial examples to label. Although AL has been extensively studied for image classification tasks, the specific scenario of interactive image retrieval has received relatively little attention. This scenario presents unique characteristics, including an open-set and class-imbalanced binary classification, starting with very few labeled samples. We introduce a novel batch-mode Active Learning framework named GAL (Greedy Active Learning) that better copes with this application. It incorporates a new acquisition function for sample selection that measures the impact of each unlabeled sample on the classifier. We further embed this strategy in a greedy selection approach, better exploiting the samples within each batch. We evaluate our framework with both linear (SVM) and non-linear MLP/Gaussian Process classifiers. For the Gaussian Process case, we show a theoretical guarantee on the greedy approximation. Finally, we assess our performance for the interactive content-based image retrieval task on several benchmarks and demonstrate its superiority over existing approaches and common baselines. Code is available at https://github.com/barleah/GreedyAL.

Paper Structure

This paper contains 22 sections, 22 equations, 16 figures, 7 tables, 2 algorithms.

Figures (16)

  • Figure 1: Main flow of the AL cycle. The top-K candidate set at cycle $t$, determined by the classifier $\mathcal{C}_{t}(\theta)$, can be selected as the pool from the unlabeled/search corpus. The AL module extracts a batch set $\mathcal{X}_b$, which is sent for annotation to a user (oracle) who generates the label set $\mathcal{Y}_b$. Based on the extended training set, a new classifier $\mathcal{C}_{t+1}(\theta)$ is trained for the next cycle.
  • Figure 2: Label proxy demonstration: The points are sampled from two Gaussian distributions, demonstrating the change in the decision boundary for two label options. Red and blue denote negative and positive labels, respectively. Bold and light points represent train and candidate samples, respectively, with their corresponding labels. The green dashed line represents the classifier based solely on the train set (bold circles). The blue and red lines signify the resulting classifier if the selected point (green circle) is labeled as blue or red. The blue classifier exhibits a lower deviation from the dashed green line, consistent with the true label (blue).
  • Figure 3: In a 2D Gaussian toy example, we illustrate a binary class scenario characterized by an imbalanced distribution of data, with red samples representing irrelevant data and blue samples representing relevant data. We compare three fundamental selection strategies, (a) Random, (b) Pure diversity (Kmeans++), and (c) Pure uncertainty (maximal entropy), to (d) the suggested GAL method. Initially, one relevant and 13 irrelevant samples are labeled. The initial SVM classifier is illustrated by a colored dashed line, followed by the corresponding solid line after updating the classifier with the addition of six samples ($B=6$). The dashed black line represents an "upper-bound", where the classifier is trained with all the data and their true labels. Notice that the most significant improvement is observed in the classifier with our GAL method, closing the gap toward the upper-bound and demonstrating a selection pattern that effectively combines diversity and uncertainty. The order of selection in GAL is depicted in (d) by $i_0$ to $i_5$, with corresponding impact scores of 1.75, 1.02, 0.80, 1.06, 0.59, and 0.66. Note that although $i_3$ and $i_5$ are close, they are on opposite sides of the classifier and close to the boundary. This means they have significant uncertainty measures and therefore a substantial impact on the decision boundary.
  • Figure 4: To calculate the score for a point $x_i$ in the candidate set, we train a classifier $\mathcal{C}(\theta^+_i)$ by assuming the sample is positive. Similarly, we train another classifier $\mathcal{C}(\theta^-_i)$ with a negative label. The impact value $\mathcal{S}_i$ is then determined as the minimum value obtained by applying a function $\mathcal{F}$ to both options (Eq. \ref{eq:label proxy}).
  • Figure 5: In the SVM scenario, the GAL algorithm employs a binary tree structure. The initial point $x_{i_0}$ is chosen through the NEXT procedure (Algorithm \ref{alg:greedy_gen}). The red circles represent the results obtained from NEXT, which are based on the corresponding pseudo-labels.
  • ...and 11 more figures
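
The impact value described in the Figure 4 caption, and the greedy use of pseudo-labels mentioned in Figures 2 and 5, can be illustrated with a small Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the classifier is a toy centroid-based linear model rather than an SVM, MLP, or GP; the function $\mathcal{F}$ is taken to be the Euclidean distance between parameter vectors; and the greedy loop follows a single pseudo-label path instead of the binary tree of Figure 5. All function names are hypothetical.

```python
import math

def fit_centroid_classifier(X, y):
    """Toy stand-in for the classifier C(theta): w = mean(pos) - mean(neg).

    Assumption: this replaces the paper's SVM/MLP/GP purely for illustration.
    Returns theta = [w_1, ..., w_d, b]. Needs at least one sample per class.
    """
    dim = len(X[0])
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    mean_pos = [sum(p[d] for p in pos) / len(pos) for d in range(dim)]
    mean_neg = [sum(p[d] for p in neg) / len(neg) for d in range(dim)]
    w = [a - b for a, b in zip(mean_pos, mean_neg)]
    # The bias puts the decision boundary midway between the two centroids.
    b = -sum(wi * (a + c) / 2 for wi, a, c in zip(w, mean_pos, mean_neg))
    return w + [b]

def impact_score(X_train, y_train, x):
    """Return (S_i, pseudo_label) for one candidate x.

    S_i = min over both label options of F(theta, theta_i^{+/-}), where F is
    assumed here to be the Euclidean distance between parameter vectors.
    The pseudo-label is the option with the smaller change.
    """
    theta = fit_centroid_classifier(X_train, y_train)
    change = {}
    for label in (0, 1):
        theta_label = fit_centroid_classifier(X_train + [x], y_train + [label])
        change[label] = math.dist(theta, theta_label)
    pseudo_label = min(change, key=change.get)
    return change[pseudo_label], pseudo_label

def greedy_batch(X_train, y_train, candidates, batch_size):
    """Simplified single-path greedy selection (not the binary-tree variant).

    Repeatedly picks the candidate with the largest impact score, then adds it
    to the working training set with its pseudo-label before scoring the rest.
    """
    X, y = list(X_train), list(y_train)
    remaining = list(range(len(candidates)))
    batch = []
    while remaining and len(batch) < batch_size:
        results = {i: impact_score(X, y, candidates[i]) for i in remaining}
        best = max(remaining, key=lambda i: results[i][0])
        batch.append(best)
        X.append(candidates[best])
        y.append(results[best][1])
        remaining.remove(best)
    return batch

if __name__ == "__main__":
    X_train, y_train = [[0.0, 0.0], [2.0, 0.0]], [0, 1]
    candidates = [[1.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
    print([impact_score(X_train, y_train, c)[0] for c in candidates])
    print(greedy_batch(X_train, y_train, candidates, batch_size=2))
```

In this toy setup, a candidate that duplicates an already labeled positive sample receives an impact of zero, because one of its two label options leaves the classifier unchanged. This reflects the max-min intuition in the captions: a sample is worth querying only if both possible labels would move the decision boundary.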