ISLE: An Intelligent Streaming Framework for High-Throughput AI Inference in Medical Imaging
Pranav Kulkarni, Sean Garin, Adway Kanhere, Eliot Siegel, Paul H. Yi, Vishwa S. Parekh
TL;DR
This work tackles bandwidth and compute bottlenecks in streaming medical images for AI inference by introducing ISLE, an intelligent streaming framework built on High-Throughput JPEG 2000 (HTJ2K) with progressive encoding. ISLE streams only the sub-resolutions the AI model needs, using three components: a Progressive Encoder, a Stream Optimizer that selects the optimal decomposition, and a Progressive Decoder that reconstructs the chosen resolution, all without compromising AI performance. Across the NIH, CheXpert, and MIMIC datasets (including DICOM), ISLE reduces data transmission by about 98% and decoding time by about 98%, and substantially improves throughput, while maintaining AUROC comparable to full-resolution data. This approach enables cost-effective clinical AI inference at scale and could help democratize deployment across healthcare settings; future work would extend it to 3D imaging and other AI tasks.
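To make the Stream Optimizer idea concrete, here is a minimal sketch of how a sub-resolution might be chosen. It relies only on the standard JPEG 2000 property that each wavelet decomposition level halves the image width and height (rounding up). The selection rule used here, picking the smallest sub-resolution whose shorter side is still at least the model's input size, is an assumption for illustration and may differ from ISLE's actual criterion. The function names, the 2048x2500 image size, and the 224-pixel model input are likewise hypothetical.

```python
import math


def subresolution_sizes(width, height, num_decompositions):
    """List (level, width, height) for every sub-resolution.

    Level 0 is the full-resolution image; each further level halves
    both dimensions, rounding up, as in a JPEG 2000 wavelet pyramid.
    """
    sizes = []
    for level in range(num_decompositions + 1):
        scale = 2 ** level
        sizes.append((level, math.ceil(width / scale), math.ceil(height / scale)))
    return sizes


def select_level(width, height, num_decompositions, model_input_size):
    """Hypothetical selection rule (an assumption, not ISLE's documented one):
    return the deepest level whose shorter side is still at least
    model_input_size. Falls back to level 0 (full resolution) if even the
    full image is smaller than the model input.
    """
    best = 0
    for level, w, h in subresolution_sizes(width, height, num_decompositions):
        if min(w, h) >= model_input_size:
            best = level
    return best


if __name__ == "__main__":
    # Hypothetical chest radiograph and model input size.
    for level, w, h in subresolution_sizes(2048, 2500, 5):
        print(f"level {level}: {w} x {h}")
    print("selected level:", select_level(2048, 2500, 5, 224))
```

Under these assumptions, only the codestream data up to the selected level would need to be streamed and decoded. The pixel count shrinks by roughly a factor of four per level, although the number of bytes saved depends on the encoding and is not modeled here.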
Abstract
As the adoption of Artificial Intelligence (AI) systems within the clinical environment grows, limitations in bandwidth and compute can create communication bottlenecks when streaming imaging data, leading to delays in patient care and increased cost. As a result, healthcare providers and AI vendors will require greater computational infrastructure, which dramatically increases costs. To that end, we developed ISLE, an intelligent streaming framework for high-throughput, compute- and bandwidth-optimized, and cost-effective AI inference for clinical decision making at scale. In our experiments, ISLE on average reduced data transmission by 98.02% and decoding time by 98.09%, while increasing throughput by 2,730%. We show that ISLE results in faster turnaround times and reduced overall cost of data, transmission, and compute, without negatively impacting clinical decision making using AI systems.
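The percentages in the abstract can be easier to interpret as multiplicative factors. The short sketch below does only that arithmetic: a reduction of p% leaves a fraction 1 - p/100 of the original, and an increase of p% corresponds to a factor of 1 + p/100. The helper names are hypothetical, and the only inputs are the three averages reported above.

```python
def reduction_to_factor(percent_reduction):
    """Convert 'reduced by p%' into 'x times smaller'."""
    return 1.0 / (1.0 - percent_reduction / 100.0)


def increase_to_factor(percent_increase):
    """Convert 'increased by p%' into 'x times larger'."""
    return 1.0 + percent_increase / 100.0


if __name__ == "__main__":
    print(round(reduction_to_factor(98.02), 1))  # data transmitted: about 50.5x less
    print(round(reduction_to_factor(98.09), 1))  # decoding time: about 52.4x less
    print(round(increase_to_factor(2730.0), 1))  # throughput: about 28.3x higher
```

Read this way, the reported averages correspond to roughly 50 times less data transmitted, roughly 52 times less decoding time, and roughly 28 times higher throughput.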
