PySpatial: A High-Speed Whole Slide Image Pathomics Toolkit

Yuechen Yang, Yu Wang, Tianyuan Yao, Ruining Deng, Mengmeng Yin, Shilin Zhao, Haichun Yang, Yuankai Huo

TL;DR

PySpatial addresses the cost of analyzing large Whole Slide Images (WSIs) by replacing the patch-level workflow used with CellProfiler with feature extraction performed directly on computational regions within the WSI. It combines an R-tree spatial index with matrix-based batch computation to preserve spatial context while accelerating the calculation of 247 pathomic features across four categories: Size & Shape, Texture, Intensity, and Intensity Distribution. The authors validate PySpatial on two datasets, PEC and KPMP, reporting a nearly 10-fold speedup for the small, sparse objects in PEC and about a 2-fold speedup for larger structures such as glomeruli and arteries in KPMP, with feature distributions consistent with CellProfiler. They also discuss memory management, offering a matrix size parameter and an alternative object-level API for very large objects, which supports PySpatial's use in large-scale digital pathology.
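To make "matrix-based batch computation" concrete, here is a minimal sketch that computes one Size & Shape feature (area) and one Intensity feature (mean intensity) for every labeled object in a single vectorized pass. It is an illustration only, not PySpatial's implementation: the function name `batch_region_features`, the label-matrix input format, and the choice of `np.bincount` are all assumptions made for this example.

```python
import numpy as np

def batch_region_features(labels: np.ndarray, intensity: np.ndarray) -> dict:
    """Area and mean intensity for all labeled objects at once.

    labels:    2-D integer array, 0 = background, k > 0 = object k.
    intensity: 2-D float array with the same shape as labels.
    (Hypothetical helper for illustration; not part of PySpatial.)
    """
    flat_labels = labels.ravel()
    n = int(flat_labels.max()) + 1
    # bincount aggregates over all pixels sharing a label,
    # so there is no Python-level loop over objects.
    area = np.bincount(flat_labels, minlength=n)
    total = np.bincount(flat_labels, weights=intensity.ravel(), minlength=n)
    # Guard against labels that never occur (area == 0).
    mean = np.divide(total, area, out=np.zeros(n), where=area > 0)
    # Index 0 is the background, so drop it.
    return {"area": area[1:], "mean_intensity": mean[1:]}

# Toy 3x3 region with two objects (labels 1 and 2).
labels = np.array([[0, 1, 1],
                   [2, 2, 0],
                   [2, 0, 0]])
intensity = np.array([[9.0, 2.0, 4.0],
                      [1.0, 2.0, 9.0],
                      [3.0, 9.0, 9.0]])
features = batch_region_features(labels, intensity)
```

Here object 1 covers two pixels and object 2 covers three. The cost of the aggregation depends on the number of pixels rather than the number of objects, which is one plausible reason batch computation helps most when a region contains many small objects.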

Abstract

Whole Slide Image (WSI) analysis plays a crucial role in modern digital pathology, enabling large-scale feature extraction from tissue samples. However, traditional feature extraction pipelines based on tools like CellProfiler often involve lengthy workflows, requiring WSI segmentation into patches, feature extraction at the patch level, and subsequent mapping back to the original WSI. To address these challenges, we present PySpatial, a high-speed pathomics toolkit specifically designed for WSI-level analysis. PySpatial streamlines the conventional pipeline by operating directly on computational regions of interest, reducing redundant processing steps. Using R-tree-based spatial indexing and matrix-based computation, PySpatial efficiently maps and processes computational regions, significantly accelerating feature extraction while maintaining high accuracy. Our experiments on two datasets, Perivascular Epithelioid Cell (PEC) and data from the Kidney Precision Medicine Project (KPMP), demonstrate substantial performance improvements. For smaller and sparse objects in the PEC dataset, PySpatial achieves nearly a 10-fold speedup compared to standard CellProfiler pipelines. For larger objects, such as glomeruli and arteries in the KPMP dataset, PySpatial achieves a 2-fold speedup. These results highlight PySpatial's potential to handle large-scale WSI analysis with enhanced efficiency and accuracy, paving the way for broader applications in digital pathology.
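The spatial-indexing step can be pictured as a rectangle query: given a computational region, return the objects whose bounding boxes overlap it. The sketch below uses a simple uniform grid index as a stand-in for the R-tree, so that it runs with the standard library alone. It answers the same intersection query but is not the data structure the paper uses, and the class name `GridIndex`, the cell size, and the example boxes are all hypothetical.

```python
from collections import defaultdict

def _overlaps(a, b):
    """True if boxes a and b, given as (min_x, min_y, max_x, max_y), intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

class GridIndex:
    """Uniform grid index over bounding boxes (stand-in for an R-tree)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)   # (gx, gy) -> ids of boxes touching that cell
        self.boxes = {}                 # id -> bounding box

    def _cells_for(self, box):
        min_x, min_y, max_x, max_y = box
        c = self.cell_size
        for gx in range(int(min_x // c), int(max_x // c) + 1):
            for gy in range(int(min_y // c), int(max_y // c) + 1):
                yield gx, gy

    def insert(self, obj_id, box):
        self.boxes[obj_id] = box
        for cell in self._cells_for(box):
            self.cells[cell].add(obj_id)

    def intersection(self, query):
        # Collect candidates from the cells the query touches,
        # then filter with an exact overlap test.
        candidates = set()
        for cell in self._cells_for(query):
            candidates |= self.cells.get(cell, set())
        return sorted(i for i in candidates if _overlaps(self.boxes[i], query))

index = GridIndex(cell_size=1000)
index.insert(0, (120, 80, 180, 140))        # small object near the origin
index.insert(1, (5200, 4100, 5900, 4800))   # large object far away
index.insert(2, (950, 900, 1100, 1050))     # object straddling the region edge

# Objects overlapping the computational region (0, 0)-(1024, 1024).
hits = index.intersection((0, 0, 1024, 1024))
```

Object 2 is returned even though it extends past the region boundary. Under a fixed patch grid such an object would be cut at the patch edge, whereas a query on WSI coordinates keeps it whole, which is the sense in which indexing preserves spatial context.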

Paper Structure (1 section, 1 figure)

Table of Contents

  1. Introduction

Figures (1)

  • Figure 1: Comparison between PySpatial and CellProfiler pipelines for WSI feature extraction. The top panel illustrates the PySpatial pipeline, which directly processes Whole Slide Images (WSIs) to extract features, simplifying the workflow and improving efficiency. In contrast, the bottom panel represents the traditional CellProfiler pipeline, where WSIs are first divided into smaller patches, with their coordinates recorded, followed by feature extraction at the patch level, and finally mapping features back to the original WSI. PySpatial eliminates the intermediate patch-level processing, resulting in a more streamlined and computationally efficient workflow.
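The coordinate bookkeeping in the patch-based pipeline described in the caption can be sketched as follows. This is an assumption-laden illustration rather than CellProfiler's or PySpatial's code: `tile_origins` and `patch_to_wsi` are hypothetical helpers, and non-overlapping square patches are assumed.

```python
def tile_origins(width, height, patch_size):
    """Top-left corners of non-overlapping patches covering a width x height WSI."""
    return [(x, y)
            for y in range(0, height, patch_size)
            for x in range(0, width, patch_size)]

def patch_to_wsi(patch_origin, local_xy):
    """Map a coordinate measured inside a patch back to WSI coordinates."""
    origin_x, origin_y = patch_origin
    x, y = local_xy
    return (origin_x + x, origin_y + y)

# A 2048 x 1024 slide split into 1024-pixel patches gives two patches.
origins = tile_origins(2048, 1024, 1024)

# A centroid found at (10, 20) inside the second patch, in WSI coordinates.
centroid_wsi = patch_to_wsi(origins[1], (10, 20))
```

Every per-patch result has to carry its patch origin through the pipeline for this final mapping to work. Processing regions directly in WSI coordinates removes that step.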