Self-Supervised and Few-Shot Learning for Robust Bioaerosol Monitoring

Adrian Willi, Pascal Baumann, Sophie Erb, Fabian Gröger, Yanick Zeder, Simone Lionetti

TL;DR

Self-supervised learning and few-shot learning can be combined to classify holographic images of pollen grains, using a large collection of unlabelled data and only a few identified particles per type.

Abstract

Real-time bioaerosol monitoring is improving the quality of life for people affected by allergies, but it often relies on deep-learning models that pose challenges for widespread adoption. These models are typically trained in a supervised fashion and need large amounts of annotated data; producing it takes considerable effort, which must be repeated for new particles, geographical regions, or measurement systems. In this work, we show that self-supervised learning and few-shot learning can be combined to classify holographic images of bioaerosol particles using a large collection of unlabelled data and only a few examples for each particle type. We first demonstrate that self-supervision on pictures of unidentified particles from ambient air measurements enhances identification even when labelled data is abundant. Most importantly, it greatly improves few-shot classification when only a handful of labelled images are available. Our findings suggest that real-time bioaerosol monitoring workflows can be substantially optimized, and the effort required to adapt models for different situations considerably reduced.
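
The self-supervised step mentioned in the abstract is identified in the Figure 1 caption as SimCLR, where two augmented views of the same particle are pulled together and views of different particles are pushed apart. A minimal sketch of that contrastive objective (the NT-Xent loss commonly associated with SimCLR) is given below in plain Python. The function names, the temperature of 0.5, and the toy embeddings are illustrative assumptions, not values taken from the paper, and the encoder and augmentations are left out entirely.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent loss for N inputs with two augmented views each.

    views_a[i] and views_b[i] are embeddings of two views of the same input.
    The temperature value is an assumption chosen for illustration.
    """
    n = len(views_a)
    z = [l2_normalize(v) for v in list(views_a) + list(views_b)]
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same input
        sims = [
            sum(a * b for a, b in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
        ]
        # Log-sum-exp over every view except i itself (numerically stable form).
        others = [sims[k] for k in range(2 * n) if k != i]
        m = max(others)
        log_denominator = m + math.log(sum(math.exp(s - m) for s in others))
        # Cross-entropy term: the positive pair should dominate the denominator.
        total += log_denominator - sims[positive]
    return total / (2 * n)


# Toy check: matching views should give a lower loss than mismatched ones.
anchors = [[1.0, 0.0], [0.0, 1.0]]
matching = [[1.0, 0.1], [0.1, 1.0]]
mismatched = [[0.1, 1.0], [1.0, 0.1]]
print(nt_xent_loss(anchors, matching) < nt_xent_loss(anchors, mismatched))
```

Minimizing this quantity needs no labels: the only supervision is the knowledge that two views came from the same input image.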

Paper Structure (6 sections, 2 figures, 1 table)

This paper contains 6 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Simplified illustration of the SimCLR framework using holographic images of airborne particles from SwisensPoleno. Every input image in a batch is augmented, resulting in different views of the same particle. These views are then all passed through the same deep neural network encoder to obtain vector representations. Finally, representations from the same input image are attracted while the ones from different input images are repelled. This produces semantic representations without the need for labelled data.
  • Figure 2: Balanced accuracy for pollen classification as a function of the number of available labelled images per taxon, evaluated on data from the same SwisensPoleno (P5) used for (left), and on data from another SwisensPoleno (P4) (right). Results are shown for the linear classifier (top) and prototype learning (bottom) on top of features from ImageNet (teal) and ImageNet+SimCLR (purple). Colored bands represent the uncertainty due to the random choice of the labelled images.
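
The Figure 2 caption refers to prototype learning on top of fixed features and to balanced accuracy as the evaluation metric. The sketch below shows one common reading of both ideas in plain Python: each class prototype is the mean of its few labelled feature vectors, a query is assigned to the nearest prototype, and balanced accuracy is the mean of per-class recall. The squared Euclidean distance, the function names, and the taxon labels `taxon_a` and `taxon_b` are assumptions made for illustration; the paper's exact formulation may differ.

```python
from collections import defaultdict


def class_prototypes(features, labels):
    """Average the feature vectors of each class to obtain one prototype per class."""
    sums, counts = {}, defaultdict(int)
    for feature, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feature)
        sums[label] = [s + x for s, x in zip(sums[label], feature)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(prototypes, feature):
    """Return the label of the nearest prototype (squared Euclidean distance, an assumption)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda label: sq_dist(prototypes[label], feature))


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes weigh as much as common ones."""
    correct, total = defaultdict(int), defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == guess)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Two labelled "shots" per hypothetical taxon, given as toy 2-D feature vectors.
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
support_labels = ["taxon_a", "taxon_a", "taxon_b", "taxon_b"]
prototypes = class_prototypes(support, support_labels)

queries = [[0.1, 0.1], [0.9, 1.0]]
predictions = [predict(prototypes, q) for q in queries]
print(predictions, balanced_accuracy(["taxon_a", "taxon_b"], predictions))
```

Because the prototypes are just class means, adding a new particle type under this scheme only requires averaging the features of its few labelled images, with no retraining of the encoder.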