Transductive One-Shot Learning Meet Subspace Decomposition

Kyle Stein, Andrew A. Mahyari, Guillermo Francia, Eman El-Sheikh

TL;DR

This paper addresses one-shot learning (OSL) by proposing a transductive framework that uses subspace decomposition to transfer labels from a single labeled example to unlabeled query images. It jointly learns a Basis Matrix ${\bf W}$ and a Coefficient Matrix ${\bf Y}$ that decompose the combined support and query embeddings ${\bf H}=f_\theta({\bf X})$ as ${\bf H}\approx {\bf W}{\bf Y}$, under nonnegativity and softmax-based probability constraints. Labels propagate through the learned subspace by comparing coefficient representations, which lets the method generalize to novel classes across multiple backbones and datasets, as demonstrated on mini-ImageNet and tiered-ImageNet with 1-shot and 10-way tasks. The results show improvements over state-of-the-art transductive OSL methods. The paper also discusses robustness, limitations related to feature-space size, and directions for extending the approach to zero-shot learning by linking primitives to semantic attributes.
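
One way to write such a decomposition as an optimization problem is sketched below. The squared Frobenius reconstruction error, the dimensions $d$, $n$, $k$, the logit matrix ${\bf Z}$, and the choice to place nonnegativity on ${\bf W}$ and the softmax on the columns of ${\bf Y}$ are all assumptions made for illustration. The summary above does not state the paper's exact objective, which may differ.

$$
\min_{{\bf W}\ge 0,\ {\bf Z}}\ \tfrac{1}{2}\,\big\|{\bf H}-{\bf W}{\bf Y}\big\|_F^2
\quad\text{s.t.}\quad {\bf Y}_{:,j}=\operatorname{softmax}\!\big({\bf Z}_{:,j}\big),\quad j=1,\dots,n,
$$

where ${\bf H}\in\mathbb{R}^{d\times n}$ stacks the $n$ support and query embeddings as columns, ${\bf W}\in\mathbb{R}^{d\times k}$ holds $k$ primitives, and ${\bf Y}\in\mathbb{R}^{k\times n}$ holds the per-image coefficients. Under this reading, each column of ${\bf Y}$ is a probability vector over primitives, so two images can be compared by comparing their columns.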

Abstract

One-shot learning focuses on adapting pretrained models to recognize newly introduced and unseen classes based on a single labeled image. While variations of few-shot and zero-shot learning exist, one-shot learning remains a challenging yet crucial problem due to its ability to generalize knowledge to unseen classes from just one human-annotated image. In this paper, we introduce a transductive one-shot learning approach that employs subspace decomposition to utilize the information from labeled images in the support set and unlabeled images in the query set. These images are decomposed into a linear combination of latent variables representing primitives captured by smaller subspaces. By representing images in the query set as linear combinations of these latent primitives, we can propagate the label from a single image in the support set to query images that share similar combinations of primitives. Through a comprehensive quantitative analysis across various neural network feature extractors and datasets, we demonstrate that our approach can effectively generalize to novel classes from just one labeled image.

Paper Structure

This paper contains 9 sections, 3 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: Overall architecture of our approach for transductive one-shot learning. A pre-trained CNN extracts the features from the images, forming an embedding matrix. This matrix is then decomposed into a Basis Matrix and a Coefficient Matrix. The Basis Matrix contains fundamental class primitives, while the Coefficient Matrix encodes how these primitives combine to form image embeddings. The optimization process iteratively refines these matrices to minimize the reconstruction error. Finally, the Coefficient Matrix is used to propagate labels from the support set to the query set by classifying images with similar primitives.
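
The pipeline in Figure 1 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: random synthetic vectors stand in for the pre-trained CNN embeddings, plain projected gradient descent is swapped in for whatever optimizer the paper uses, nonnegativity is placed on the basis and the softmax on the coefficient columns, and cosine similarity is assumed as the way coefficients are compared. The names `softmax_columns`, `decompose`, and `propagate_labels` are hypothetical.

```python
import numpy as np


def softmax_columns(Z):
    """Column-wise softmax, so every column is a probability vector."""
    Z = Z - Z.max(axis=0, keepdims=True)  # subtract the max for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=0, keepdims=True)


def decompose(H, k, steps=300, lr=1e-2, seed=0):
    """Approximate H (d x n) by W @ Y with W >= 0 and Y = softmax(Z).

    Assumed objective: 0.5 * ||W @ Y - H||_F^2, minimized by projected
    gradient descent (an illustrative choice, not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    d, n = H.shape
    W = np.abs(rng.normal(size=(d, k)))  # nonnegative initial basis
    Z = rng.normal(size=(k, n))          # unconstrained logits behind Y
    for _ in range(steps):
        Y = softmax_columns(Z)
        R = W @ Y - H                    # reconstruction residual, shape (d, n)
        grad_W = R @ Y.T
        grad_Y = W.T @ R
        # Chain rule through the column-wise softmax.
        grad_Z = Y * (grad_Y - (grad_Y * Y).sum(axis=0, keepdims=True))
        W = np.maximum(W - lr * grad_W, 0.0)  # project back onto W >= 0
        Z = Z - lr * grad_Z
    return W, softmax_columns(Z)


def propagate_labels(Y, support_idx, support_labels):
    """Give each column of Y the label of its most similar support column."""
    Yn = Y / (np.linalg.norm(Y, axis=0, keepdims=True) + 1e-12)
    sim = Yn.T @ Yn[:, support_idx]      # cosine similarity, shape (n, n_support)
    return np.asarray(support_labels)[sim.argmax(axis=1)]


# Synthetic stand-in for CNN embeddings: 5 classes, 6 images per class.
rng = np.random.default_rng(0)
n_classes, d, per_class = 5, 16, 6
prototypes = np.abs(rng.normal(size=(d, n_classes)))
labels = np.repeat(np.arange(n_classes), per_class)
H = prototypes[:, labels] + 0.05 * np.abs(rng.normal(size=(d, labels.size)))

# One labeled (support) image per class; the remaining columns act as queries.
support_idx = np.arange(0, labels.size, per_class)
W, Y = decompose(H, k=n_classes)
pred = propagate_labels(Y, support_idx, labels[support_idx])
print("agreement with the synthetic labels:", (pred == labels).mean())
```

Parameterizing the coefficients as a softmax over free logits keeps every column of `Y` on the probability simplex without an explicit projection step, while the basis only needs clipping at zero after each update. Swapping in real embeddings would only change how `H` is built.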