Correlation Clustering of Organoid Images

Jannik Presberger, Rashmiparvathi Keshara, David Stein, Yung Hae Kim, Anne Grapin-Botton, Bjoern Andres

TL;DR

This work adopts models and algorithms for correlating organoid images, i.e., for quantifying the similarity in appearance and geometry of the organoids they depict, and for clustering organoid images by consolidating conflicting correlations.

Abstract

In biological and medical research, scientists now routinely acquire microscopy images of hundreds of morphologically heterogeneous organoids and are then faced with the task of finding patterns in the image collection, i.e., subsets of organoids that appear similar and potentially represent the same morphological class. We adopt models and algorithms for correlating organoid images, i.e., for quantifying the similarity in appearance and geometry of the organoids they depict, and for clustering organoid images by consolidating conflicting correlations. For correlating organoid images, we adopt and compare two alternatives, a partial quadratic assignment problem and a twin network. For clustering organoid images, we employ the correlation clustering problem. Empirically, we learn the parameters of these models, infer a clustering of organoid images, and quantify the accuracy of the inferred clusters, with respect to a training set and a test set we contribute of state-of-the-art light microscopy images of organoids clustered manually by biologists.
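To make "clustering by consolidating conflicting correlations" concrete, here is a minimal sketch of correlation clustering on a toy instance. It is not the algorithm used in the paper: it enumerates all partitions by brute force, which is only feasible for a handful of images. The sign convention (a positive value rewards putting two images in the same cluster, a negative value penalizes it) and the names `partitions`, `score` and `correlation_clustering` are assumptions made for illustration.

```python
from itertools import combinations


def partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + p


def score(partition, w):
    """Sum of pairwise values over all pairs that share a block."""
    return sum(
        w[frozenset(pair)]
        for block in partition
        for pair in combinations(block, 2)
    )


def correlation_clustering(n, w):
    """Brute-force search for the partition of {0, ..., n-1} with maximal score."""
    best = max(partitions(list(range(n))), key=lambda p: score(p, w))
    return sorted(sorted(block) for block in best)


# Hypothetical pairwise values for three images: 0 and 1 look alike,
# 1 and 2 look somewhat alike, but 0 and 2 look clearly different.
w = {
    frozenset({0, 1}): 3.0,
    frozenset({1, 2}): 2.0,
    frozenset({0, 2}): -4.0,
}
print(correlation_clustering(3, w))  # [[0, 1], [2]]
```

The three pairwise values conflict: no clustering agrees with all of them. Joining all three images scores 1, while separating image 2 scores 3, so the conflict is resolved by cutting the weaker positive pair.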

Paper Structure (36 sections, 3 theorems, 11 equations, 11 figures, 1 table)

This paper contains 36 sections, 3 theorems, 11 equations, 11 figures, 1 table.

Key Result

lemma 1

For any geometrically consistent $x \in X_{V_j V_k}$ and any $r^*, s, \gamma$ according to Definition 1, we have $r^* = r_0^k$ and $s = \sigma_0^k / \sigma_0^j$.

Figures (11)

  • Figure 1: Depicted above are 130 images (scaled differently to the same size for illustration) of pancreatic progenitor organoids derived from human pluripotent stem cells. These organoids consist of cells expressing a nuclear Green Fluorescent Protein reporter for PDX1 (a pancreatic progenitor marker gene). After fixation, the organoids were stained with DAPI (blue) to mark the nucleus and Phalloidin (red) for F-Actin. Images were acquired using an automated spinning disc confocal microscope (20x objective of Yokogawa CV7000).
  • Figure 2: Depicted above, from left to right, are assignments (gray lines) between key points of five pairs of organoid images. Depicted in the bottom row are projections onto the first and the second image. Depicted in Columns 1–4 are assignments between morphologically similar organoids. Depicted in the last column is an assignment between dissimilar organoids. Note that distances between assigned key points are larger here. For illustration, images are rotated, and only key points of cell nuclei are shown.
  • Figure 3: In order to map a pair of scaled organoid images $z_j, z_k \in \mathbb{R}^{3 n^2}$ to a real number that is supposed to be positive for images in the same cluster and negative for images in distinct clusters, we learn a twin network consisting of a head $g_\mu \colon \mathbb{R}^{3n^2} \to \mathbb{R}^d$ in the form of a ResNet-18 (He et al., 2016) with $d = 128$ and adjustable parameters $\mu \in \mathbb{R}^{11,242,176}$, and a base $h_\nu \colon \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ in the form of one fully connected hidden layer and one output node, with adjustable parameters $\nu \in \mathbb{R}^{33,025}$.
  • Figure 4: Depicted above are precision recall curves for the independent classification of pairs of organoid images (left) and the clustering of organoid images (right), on the data sets Test-100 (top), Test-30 (middle) and Test-100/30 (bottom).
  • Figure 5: Depicted above is the variation of information distance between computed and true clusterings of organoid images as a function of a constant $\chi$ added to all cost coefficients of the correlation clustering problem. For this comparison, the costs from both the partial quadratic assignment problem (PQAP) and the twin networks are scaled globally (not per instance) to $[-1,1]$, which does not alter the solutions.
  • ...and 6 more figures
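
The parameter counts in the caption of Figure 3 can be checked with a short calculation. The sketch below assumes a standard ResNet-18 layout (bias-free convolutions, each followed by batch normalization with a scale and a shift per channel, and 1x1 projection shortcuts where the channel count changes) whose final layer is a linear map from 512 features to $d$ outputs. For the base, it assumes the two embeddings are concatenated and fed to a hidden layer of width 128; that width is not stated in the caption and is inferred here because it reproduces the reported count. The function names are hypothetical.

```python
def conv_bn(c_in, c_out, k):
    """Parameters of a bias-free k x k convolution plus batch-norm scale and shift."""
    return c_in * c_out * k * k + 2 * c_out


def basic_block(c_in, c_out):
    """Parameters of one ResNet basic block (two 3x3 convolutions)."""
    n = conv_bn(c_in, c_out, 3) + conv_bn(c_out, c_out, 3)
    if c_in != c_out:
        n += conv_bn(c_in, c_out, 1)  # 1x1 projection shortcut
    return n


def resnet18_params(d):
    """Parameters of a ResNet-18 with a d-dimensional linear output layer."""
    n = conv_bn(3, 64, 7)  # stem: 7x7 convolution on 3 input channels
    c_in = 64
    for c_out in (64, 128, 256, 512):  # four stages of two blocks each
        n += basic_block(c_in, c_out) + basic_block(c_out, c_out)
        c_in = c_out
    n += 512 * d + d  # final linear layer with bias
    return n


def base_params(d, hidden):
    """Parameters of one hidden layer on 2*d inputs plus one output node."""
    return (2 * d * hidden + hidden) + (hidden + 1)


print(resnet18_params(128))   # 11242176
print(base_params(128, 128))  # 33025
```

Under these assumptions both numbers agree with the caption: 11,242,176 parameters for the head and 33,025 for the base.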

Theorems & Definitions (7)

  • definition 1
  • lemma 1
  • proof
  • lemma 2: Motivation of $d'_{vw}$ and $d''_{vwv'w'}$
  • proof
  • lemma 3: Bound on cost
  • proof