qlty: handling large tensors in scientific imaging

Petrus Zwart

TL;DR

qlty tackles memory bottlenecks in large volumetric scientific imaging by providing out-of-core tensor management with patch-based subsampling, edge-aware augmentation, and weighted stitching. The method uses sliding windows over tensors with shapes such as $(N,C,Y,X)$ to generate patches of shape $(M,C,Y_w,X_w)$, complemented by border-aware cleaning and zarr-stored mean/normalization arrays processed via dask to fuse results. A 3D tomographic segmentation example demonstrates practical viability, including data duplication from patching, an eight-model SMSNet ensemble, and coherent full-volume inference. By integrating with PyTorch and parallel I/O backends, qlty makes high-resolution scientific imaging analytics more accessible on hardware with limited memory, enabling robust segmentation and denoising workflows.
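The sliding-window and weighted-stitching idea can be illustrated with a minimal NumPy sketch. This is not qlty's API: the function names (`window_starts`, `unstitch`, `stitch`), the `edge_weight` parameter, and the choice to shift the last window back so it ends at the tensor edge are all assumptions made for illustration. The sketch keeps a running weighted sum and a normalization array in memory, whereas the text describes storing such arrays in zarr and processing them with dask.

```python
import numpy as np

def window_starts(size, window, step):
    """Start offsets of windows covering [0, size); assumes size >= window.
    The last window is shifted back so it ends exactly at the edge."""
    starts = list(range(0, size - window + 1, step))
    if starts[-1] != size - window:
        starts.append(size - window)
    return starts

def unstitch(t, window, step):
    """Split an (N, C, Y, X) array into (M, C, Yw, Xw) patches."""
    n, c, y, x = t.shape
    yw, xw = window
    patches = [
        t[i, :, y0:y0 + yw, x0:x0 + xw]
        for i in range(n)
        for y0 in window_starts(y, yw, step[0])
        for x0 in window_starts(x, xw, step[1])
    ]
    return np.stack(patches)

def stitch(patches, shape, window, step, border, edge_weight=0.1):
    """Fuse patches back into an array of `shape`, down-weighting a
    `border`-pixel rim of each patch (hypothetical weighting scheme)."""
    n, c, y, x = shape
    yw, xw = window
    w = np.full((yw, xw), edge_weight)
    w[border:yw - border, border:xw - border] = 1.0
    acc = np.zeros(shape, dtype=float)          # weighted sum of patch values
    norm = np.zeros((n, 1, y, x), dtype=float)  # sum of weights per pixel
    m = 0
    for i in range(n):  # same traversal order as unstitch
        for y0 in window_starts(y, yw, step[0]):
            for x0 in window_starts(x, xw, step[1]):
                acc[i, :, y0:y0 + yw, x0:x0 + xw] += patches[m] * w
                norm[i, :, y0:y0 + yw, x0:x0 + xw] += w
                m += 1
    return acc / norm

# Round trip: stitching unmodified patches recovers the input.
rng = np.random.default_rng(0)
t = rng.random((2, 3, 20, 20))
patches = unstitch(t, window=(8, 8), step=(6, 6))
restored = stitch(patches, t.shape, window=(8, 8), step=(6, 6), border=2)
```

In an inference setting, each patch would be passed through a network before stitching; overlapping predictions are then averaged, with patch interiors counting more than patch borders.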

Abstract

In scientific imaging, deep learning has become a pivotal tool for image analytics. However, large volumetric datasets, which often exceed the memory capacity of standard GPUs, require special handling in deep learning workflows. This paper introduces qlty, a toolkit designed to address these challenges through tensor management techniques. qlty offers robust methods for subsampling, cleaning, and stitching of large-scale spatial data, enabling effective training and inference even in resource-limited environments.

Paper Structure (9 sections, 2 figures, 3 tables, 1 algorithm)

Figures (2)

  • Figure 1: High-level overview of qlty's functionality, with specific methods depicted in red. A - For generating training data with a smaller spatial footprint, qlty subsamples a sparsely annotated tensor $\mathcal{T}$ into smaller chunks $\mathcal{S}$ while discarding parts of $\mathcal{T}$ for which no training data is available. B - In inference scenarios, qlty iterates over data tensor $\mathcal{T}$, yielding manageable chunks of data that can be passed to a neural network of choice. The resulting output is iteratively placed back into a tensor of the right size.
  • Figure 2: Segmentation results. A -- A 2D section of the resulting segmentation of the tomographic data (grey) is overlaid with the class probability map for the intermediary filaments (multi-color heatmap) and part of the hand-annotated voxels (green, red). Boundaries of the subsampled tensors are shown as black dotted lines, while the purple solid line indicates the location of the seam where two adjacent tensors meet when taking into account the border weighting procedure. B -- A 3D overview of the tomographic density (magma heatmap), hand-labeled voxels (light-green 2D patches), and resulting class labels in 3D for the filaments from the probability map after a simple connected-component size filter (bright green).
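
The discarding step described in Figure 1A, combined with border-aware cleaning, might look like the sketch below. It is an illustration under stated assumptions rather than qlty's implementation: the helper name `drop_unlabeled` and the sentinel value `-1` for unlabeled pixels are hypothetical, and labels are assumed to be stored as an `(M, Yw, Xw)` integer array alongside `(M, C, Yw, Xw)` data patches.

```python
import numpy as np

MISSING = -1  # assumed sentinel marking pixels without an annotation

def drop_unlabeled(data, labels, border=0, missing=MISSING):
    """Mask labels in a `border`-pixel rim of each patch, then keep only
    patches that still contain at least one labeled pixel.

    data:   (M, C, Yw, Xw) array of patches
    labels: (M, Yw, Xw) array of sparse annotations
    Returns the kept data, the kept (masked) labels, and the boolean keep mask.
    """
    labels = labels.copy()
    if border > 0:
        labels[:, :border, :] = missing
        labels[:, -border:, :] = missing
        labels[:, :, :border] = missing
        labels[:, :, -border:] = missing
    keep = (labels != missing).any(axis=(1, 2))
    return data[keep], labels[keep], keep

# Three patches: one labeled in the interior, one labeled only in the
# border rim, and one with no labels at all.
data = np.zeros((3, 1, 6, 6))
labels = np.full((3, 6, 6), MISSING)
labels[0, 3, 3] = 1
labels[1, 0, 0] = 2
kept_data, kept_labels, keep = drop_unlabeled(data, labels, border=1)
```

Only the first patch survives here: the second patch's sole annotation falls inside the masked rim, so it contributes no usable training signal once borders are excluded.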