
Advancing Annotat3D with Harpia: A CUDA-Accelerated Library For Large-Scale Volumetric Data Segmentation

Camila Machado de Araujo, Egon P. B. S. Borges, Ricardo Marcelo Canteiro Grangeiro, Allan Pinto

TL;DR

This work addresses large-scale volumetric data segmentation in high-resolution imaging by introducing Harpia, a CUDA-based library integrated into Annotat3D. Harpia implements a memory-safe, chunked processing strategy that enables interactive segmentation of datasets exceeding single-GPU memory and supports multi-user HPC workflows through a browser-based interface. Key contributions include a comprehensive suite of GPU-accelerated filtering, annotation, and quantification tools, a 3D label editing workflow, and efficient resource management that outperforms cuCIM and scikit-image in both speed and memory usage, demonstrated on volumes up to 32 GiB. The system's web-based, human-in-the-loop design and scalable architecture are of practical use to synchrotron facilities and large-scale biological and materials imaging, with planned extensions toward multi-GPU and heterogeneous computing for even larger datasets.

Abstract

High-resolution volumetric imaging techniques, such as X-ray tomography and advanced microscopy, generate increasingly large datasets that challenge existing tools for efficient processing, segmentation, and interactive exploration. This work adds new capabilities to Annotat3D through Harpia, a CUDA-based processing library designed to support scalable, interactive segmentation workflows for large 3D datasets in high-performance computing (HPC) and remote-access environments. Harpia features strict memory control, native chunked execution, and a suite of GPU-accelerated filtering, annotation, and quantification tools, enabling reliable operation on datasets exceeding single-GPU memory capacity. Experimental results demonstrate significant improvements in processing speed, memory efficiency, and scalability compared to widely used frameworks such as NVIDIA cuCIM and scikit-image. The system's interactive, human-in-the-loop interface, combined with efficient GPU resource management, makes it particularly suitable for collaborative scientific imaging workflows in shared HPC infrastructures.
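
The chunked execution mentioned above can be illustrated with a minimal CPU sketch. This is not Harpia's API or implementation: the function names `box_filter_3d` and `chunked_box_filter_3d`, the choice of a box (mean) filter, and the slab-along-z layout are all assumptions made for illustration, and NumPy stands in for the GPU kernels. The idea shown is the general one: process the volume one slab at a time, extend each slab by a halo of neighboring slices so that boundary voxels see the correct neighborhood, and keep only the slices the slab owns.

```python
import numpy as np


def box_filter_3d(vol, radius=1):
    """Mean filter over a (2*radius+1)^3 window with edge-replicated borders."""
    padded = np.pad(vol.astype(np.float32), radius, mode="edge")
    out = np.zeros(vol.shape, dtype=np.float32)
    size = 2 * radius + 1
    nz, ny, nx = vol.shape
    # Accumulate every shifted copy of the padded volume, then normalize.
    for dz in range(size):
        for dy in range(size):
            for dx in range(size):
                out += padded[dz:dz + nz, dy:dy + ny, dx:dx + nx]
    return out / size**3


def chunked_box_filter_3d(vol, chunk_depth, radius=1):
    """Apply box_filter_3d slab by slab along z (hypothetical helper)."""
    depth = vol.shape[0]
    out = np.empty(vol.shape, dtype=np.float32)
    for z0 in range(0, depth, chunk_depth):
        z1 = min(z0 + chunk_depth, depth)
        # Extend the slab by a halo of `radius` slices so that voxels at the
        # slab boundary see the same neighbors as in the full volume.
        lo = max(z0 - radius, 0)
        hi = min(z1 + radius, depth)
        filtered = box_filter_3d(vol[lo:hi], radius)
        # Keep only the slices this chunk owns; halo slices are discarded.
        out[z0:z1] = filtered[z0 - lo:z1 - lo]
    return out


# Small synthetic volume; a real workload would stream slabs from disk.
rng = np.random.default_rng(0)
volume = rng.random((10, 8, 8), dtype=np.float32)
reference = box_filter_3d(volume)
chunked = chunked_box_filter_3d(volume, chunk_depth=4)
print(np.allclose(reference, chunked))
```

In this sketch the peak working set is bounded by one slab plus its halo rather than by the whole volume, which is the property that lets a chunked scheme handle data larger than device memory. On a GPU, each slab would additionally be copied to the device, processed, and copied back.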

Paper Structure

This paper contains 15 sections and 3 figures.

Figures (3)

  • Figure 1: Overall architecture of the Annotat3D web application.
  • Figure 2: Performance evaluation in terms of execution time.
  • Figure 3: Performance evaluation in terms of memory footprint.