SaccadeDet: A Novel Dual-Stage Architecture for Rapid and Accurate Detection in Gigapixel Images
Wenxi Li, Ruxin Zhang, Haozhe Lin, Yuchen Guo, Chao Ma, Xiaokang Yang
TL;DR
Gigapixel images pose severe speed and accuracy challenges because of their vast background regions and extreme variation in object scale. SaccadeDet introduces a dual-stage approach: a saccade stage first uses multi-scale density regression to locate Regions of Interest on a downsampled copy of the gigapixel image, then a scale-normalized gaze stage processes standardized patches with a megapixel detector. The method achieves up to 8x faster inference than prior gigapixel detectors on PANDA while maintaining high detection accuracy, and extends to Whole Slide Imaging with substantial efficiency gains. It offers a practical, scalable solution for fast, accurate gigapixel-level detection in medical and surveillance contexts.
Abstract
The advancement of deep learning in object detection has predominantly focused on megapixel images, leaving a critical gap in the efficient processing of gigapixel images. These super high-resolution images present unique challenges due to their immense size and computational demands. To address this, we introduce 'SaccadeDet', an innovative architecture for gigapixel-level object detection, inspired by the saccadic movements of the human eye. The cornerstone of SaccadeDet is its ability to strategically select and process only the image regions likely to contain objects, which dramatically reduces the computational load. This is achieved through a two-stage process: the 'saccade' stage, which identifies regions of probable interest, and the 'gaze' stage, which refines detection in these targeted areas. Our approach, evaluated on the PANDA dataset, not only achieves an 8x speed increase over state-of-the-art methods but also demonstrates significant potential in gigapixel-level pathology analysis through its application to Whole Slide Imaging.
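The two-stage idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a density map has already been predicted on a downsampled image, uses a single density scale with fixed square cells instead of multi-scale density regression, and treats the megapixel detector as an arbitrary callable. All names here (`select_rois`, `resize_nearest`, `saccade_then_gaze`, `threshold`, `patch_size`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def select_rois(density, threshold, cell):
    """Saccade step (simplified): keep density cells above a threshold and
    return them as (x0, y0, x1, y1) boxes in full-resolution pixels."""
    ys, xs = np.nonzero(density > threshold)
    return [
        (int(x) * cell, int(y) * cell, (int(x) + 1) * cell, (int(y) + 1) * cell)
        for y, x in zip(ys, xs)
    ]


def resize_nearest(patch, size):
    """Resize a patch to size x size with nearest-neighbour sampling, so that
    every patch reaches the detector at one standard resolution."""
    h, w = patch.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return patch[rows][:, cols]


def saccade_then_gaze(image, density, detector, threshold=0.5, patch_size=64):
    """Run the detector only on selected regions and map boxes back to
    full-image coordinates.

    Assumes the image height and width are exact multiples of the density map
    shape and that cells are square. `detector` is any callable that returns
    (x0, y0, x1, y1, score) tuples in patch coordinates.
    """
    cell = image.shape[0] // density.shape[0]
    detections = []
    for x0, y0, x1, y1 in select_rois(density, threshold, cell):
        patch = resize_nearest(image[y0:y1, x0:x1], patch_size)
        scale = (x1 - x0) / patch_size  # patch pixels -> full-image pixels
        for bx0, by0, bx1, by1, score in detector(patch):
            detections.append(
                (x0 + bx0 * scale, y0 + by0 * scale,
                 x0 + bx1 * scale, y0 + by1 * scale, score)
            )
    return detections
```

In this sketch the cost of the detector grows with the number of selected cells rather than with the full image area, which is the source of the speed-up the abstract describes; background cells below the threshold are never cropped or processed.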
