Cycle-Consistent Multi-Graph Matching for Self-Supervised Annotation of C.Elegans

Christoph Karg, Sebastian Stricker, Lisa Hutschenreiter, Bogdan Savchynskyy, Dagmar Kainmueller

TL;DR

This work presents a novel approach for unsupervised multi-graph matching, which applies to problems for which a Gaussian distribution of keypoint features can be assumed, and yields the first unsupervised atlas of C. elegans, i.e., a model of the joint distribution of all of its cell nuclei, without the need for any ground truth cell annotation.

Abstract

In this work, we present a novel approach for unsupervised multi-graph matching, which applies to problems for which a Gaussian distribution of keypoint features can be assumed. We leverage cycle consistency as a loss for self-supervised learning and determine the Gaussian parameters through Bayesian Optimization, yielding a highly efficient approach that scales to large datasets. Our fully unsupervised approach reaches the accuracy of state-of-the-art supervised methodology for the biomedical use case of semantic cell annotation in 3D microscopy images of the worm C. elegans. In doing so, it yields the first unsupervised atlas of C. elegans, i.e., a model of the joint distribution of all of its cell nuclei, without the need for any ground truth cell annotation. This advancement enables highly efficient semantic annotation of cells in large microscopy datasets, overcoming a current key bottleneck. Beyond C. elegans, our approach offers fully unsupervised construction of cell-level atlases, down to the level of unique semantic cell labels, for any model organism with a stereotyped body plan, and thus has the potential to catalyze corresponding biomedical studies in a range of further species.
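To make the notion of cycle consistency as a self-supervised signal concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each pairwise matching is a full permutation stored as a list, and it measures the fraction of cells for which matching A to B to C disagrees with matching A to C directly. The function names `compose` and `cycle_inconsistency` and the dictionary layout are hypothetical. In the approach described above, a score of this kind would be driven down by tuning the Gaussian parameters with Bayesian Optimization, a step that is not shown here.

```python
from itertools import combinations


def compose(p_ab, p_bc):
    """Chain two matchings: cell i in A maps to p_ab[i] in B, then to p_bc[p_ab[i]] in C."""
    return [p_bc[j] for j in p_ab]


def cycle_inconsistency(matchings, num_graphs):
    """Fraction of cells, over all graph triplets a < b < c, whose path a->b->c disagrees with a->c.

    matchings[(a, b)] is a list whose entry i is the index in graph b matched
    to cell i of graph a (a hypothetical representation chosen for this sketch).
    """
    violations = 0
    total = 0
    for a, b, c in combinations(range(num_graphs), 3):
        via_b = compose(matchings[(a, b)], matchings[(b, c)])
        direct = matchings[(a, c)]
        violations += sum(x != y for x, y in zip(via_b, direct))
        total += len(direct)
    return violations / total


# Toy example: three graphs with four cells each.
p01 = [1, 0, 2, 3]
p12 = [0, 2, 1, 3]
consistent = {(0, 1): p01, (1, 2): p12, (0, 2): compose(p01, p12)}
inconsistent = {(0, 1): p01, (1, 2): p12, (0, 2): [2, 0, 3, 1]}

print(cycle_inconsistency(consistent, 3))    # 0.0
print(cycle_inconsistency(inconsistent, 3))  # 0.5
```

A score of zero only says that the matchings agree with each other, not that they are correct; a nonzero score, however, guarantees that at least one matching contains an error.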

Paper Structure

This paper contains 31 sections, 12 equations, 8 figures, 8 tables.

Figures (8)

  • Figure 1: Traditional approach: Supervised worm-to-atlas matching. Left: Five exemplary worms, each consistently composed of 558 cells; each cell is expert-annotated with its unique semantic name, indicated by 558 distinct colors (note that not all colors can be distinguished by eye). Such training data serves for supervised learning of a statistical atlas (top right, atlas labels illustrated by 3D ellipsoids representing the positional covariances $\Sigma_i^{\text{cen}}$ of individual cells). A target worm (bottom right, cell instance segmentation) can then be semantically annotated by solving a graph matching problem to optimally assign atlas labels to target cell instances. For computational efficiency, the problem can be sparsified by restricting the set of possible assignments for each atlas label to a reduced subset of segments (colored ovals).
  • Figure 2: Our approach: Cycle-consistent multi-graph matching for self-supervised atlas learning. Given a set of worms (instance segmentations but no semantic labels; three examples shown), establishing cycle-consistent correspondences for all cells across all worms yields cell cliques that effectively serve to replace semantic annotations (see green example for one cell). Determining cell cliques can be phrased as a multi-graph matching problem, which extends pairwise matching problems to include cycle consistency constraints. Cycle consistency is necessary but not sufficient for correctness. Conversely, inconsistency implies an error (orange example). We thus leverage cycle consistency as a self-supervisory signal to learn the parameters of a graph matching objective via Bayesian Optimization, yielding the first unsupervised statistical atlas of C. elegans. This unsupervised atlas can then be plugged into a standard worm-to-atlas matching objective.
  • Figure 3: Atlas accuracy as a function of the training set size for different modes of the MGM solver, see \ref{sec:used-mgm-solvers}.
  • Figure 4: Impact of atlas construction for varying training set sizes $N \in \{10, 20, \dots, 100\}$. Pre-atlas accuracy evaluates the accuracy of the MGM solution when matching the $N$ training worms to themselves. Unsupervised atlas accuracy shows the average accuracy when matching the same set of $N$ worms to the atlas built from their MGM solution.
  • Figure 5: Accuracies of the unsupervised atlas as a function of the training set size, evaluated on all 100 test worms. As an upper bound, accuracy of our supervised atlas built from 100 training worms is shown.
  • ...and 3 more figures
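
The worm-to-atlas matching described in the caption of Figure 1 can be illustrated with a heavily simplified sketch. It is not the authors' objective: it assumes only positional (unary) Gaussian costs with diagonal covariances, ignores the pairwise terms that make the real problem a graph matching problem, and searches all assignments by brute force, which is feasible only for a handful of cells. The label names, the atlas layout, and the functions `neg_log_likelihood` and `match_to_atlas` are hypothetical.

```python
import math
from itertools import permutations


def neg_log_likelihood(point, mean, var):
    """Gaussian negative log-likelihood with diagonal covariance, constant term dropped."""
    return 0.5 * sum((p - m) ** 2 / v + math.log(v) for p, m, v in zip(point, mean, var))


def match_to_atlas(atlas, cells):
    """Assign each atlas label to one cell so that the summed cost is minimal.

    atlas maps a label to (mean, per-axis variance); cells is a list of 3D positions.
    Brute force over all permutations: for illustration only.
    """
    labels = list(atlas)
    best_cost, best_perm = math.inf, None
    for perm in permutations(range(len(cells))):
        cost = sum(
            neg_log_likelihood(cells[j], *atlas[label])
            for label, j in zip(labels, perm)
        )
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return dict(zip(labels, best_perm))


# Toy atlas with three hypothetical labels, each with a mean position and unit variances.
atlas = {
    "cell_a": ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
    "cell_b": ((5.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
    "cell_c": ((0.0, 5.0, 0.0), (1.0, 1.0, 1.0)),
}
# Unlabeled cell centers of a target worm, listed in arbitrary order.
cells = [(4.8, 0.1, 0.0), (0.2, -0.1, 0.1), (0.1, 5.2, -0.2)]

assignment = match_to_atlas(atlas, cells)
print(assignment)  # {'cell_a': 1, 'cell_b': 0, 'cell_c': 2}
```

For realistic sizes such as the 558 cells mentioned above, an exhaustive search is out of the question; a dedicated assignment or graph matching solver would be needed, together with the sparsification the caption describes.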