The Berkeley Single Cell Computational Microscopy (BSCCM) Dataset

Henry Pinkard, Cherry Liu, Fanice Nyatigo, Daniel A. Fletcher, Laura Waller

TL;DR

BSCCM addresses reproducibility challenges in computational microscopy by providing a large, multi-modal single-cell dataset that links LED-array label-free imaging with fluorescence readouts and ground-truth cell-type labels. The approach combines physics-informed acquisition with data-driven analysis across multiple dataset variants (BSCCM, BSCCMNIST, BSCCM-coherent) and includes rich metadata and calibration to support robust benchmarking. Key contributions include large-scale, annotated ground-truth data across multiple illumination contrasts and a pipeline for fluorescence demixing and cross-modal alignment, enabling rigorous evaluation of reconstruction and phenotyping algorithms. This resource has practical biomedical impact by accelerating the development of cost-effective, robust computational microscopy tools for clinical and research applications.

Abstract

Computational microscopy, in which the hardware and algorithms of an imaging system are jointly designed, shows promise for making imaging systems that cost less, perform more robustly, and collect new types of information. Often, the performance of computational imaging systems, especially those that incorporate machine learning, is sample-dependent. Thus, standardized datasets are an essential tool for comparing the performance of different approaches. Here, we introduce the Berkeley Single Cell Computational Microscopy (BSCCM) dataset, which contains approximately 12,000,000 images of 400,000 individual white blood cells. The dataset contains images captured with multiple illumination patterns on an LED array microscope and fluorescence measurements of the abundance of surface proteins that mark different cell types. We hope this dataset will provide a valuable resource for the development and testing of new algorithms in computational microscopy and computer vision with practical biomedical applications.

Paper Structure (31 sections, 3 equations, 10 figures, 2 tables)

Figures (10)

  • Figure 1: Berkeley Single Cell Computational Microscopy (BSCCM) dataset overview. a) Schematic of the microscope used in data collection: a commercial fluorescence microscope body with its trans-illumination lamp replaced by a programmable LED array quasi-dome. b) The LED array was used for label-free imaging of cells with different illumination patterns. c) The fluorescence light path was used to capture 6-channel fluorescence images of the same cells. d) This provided both protein expression levels and label-free images for the same cells.
  • Figure 2: LED array and histology contrasts in BSCCM and BSCCM-coherent
  • Figure 3: LED array patterns used in the BSCCM dataset and the names used for them in the dataset
  • Figure 4: DPC and 6-channel fluorescence images in the all-antibodies staining condition
  • Figure 5: Antibody staining conditions
  • ...and 5 more figures