Edge GPU Aware Multiple AI Model Pipeline for Accelerated MRI Reconstruction and Analysis

Ashiyana Abdul Majeed, Mahmoud Meribout, Safa Mohammed Sali

TL;DR

This paper tackles the challenge of running a multi-model medical-imaging pipeline at the edge: a hardware-aware GAN that reconstructs MRI from CT images runs alongside a diagnostic detector on NVIDIA Jetson platforms. It relies on HaX-CoNN scheduling to partition model layers between the GPU and the DLA, reducing engine idle time, and fine-tunes the GAN so that no layer falls back from the DLA to the GPU, reaching nearly 150 FPS. The authors evaluate two deployment schemas (standalone edge and client-server) and report a ~5% accuracy improvement for the edge-tuned GAN, with the potential to double throughput by running two GAN models on suitably partitioned hardware. The work points to practical edge-ready pipelines for MRI reconstruction and real-time diagnosis, with implications for remote clinics and latency-sensitive imaging workflows.

Abstract

Advancements in AI have greatly enhanced the medical imaging process, making it quicker to diagnose patients. However, very few studies have investigated the optimization of a multi-model system with hardware acceleration. As specialized edge devices emerge, the efficient use of their accelerators is becoming increasingly crucial. This paper proposes a hardware-accelerated method for the simultaneous reconstruction and diagnosis of magnetic resonance imaging (MRI) from computed tomography (CT) images. Real-time performance, with a throughput of nearly 150 frames per second, was achieved by leveraging the hardware engines available in modern NVIDIA edge GPUs, along with scheduling techniques. These engines include the GPU and the deep learning accelerator (DLA) available in both the Jetson AGX Xavier and the Jetson AGX Orin, which were considered in this paper. The layers of the multiple AI models were allocated to the hardware engines in such a way that the idle time between the engines is reduced. In addition, the generative adversarial network (GAN) models were fine-tuned so that no fallback execution on the GPU engine is required, without compromising accuracy. Indeed, the fine-tuned edge GPU-aware AI models exhibited an accuracy improvement of 5%. A further hardware allocation of two fine-tuned GPU-aware GAN models shows that they can double the performance of the original model, leveraging adequate partitioning on the NVIDIA Jetson AGX Xavier and Orin devices. The results demonstrate the effectiveness of employing hardware-aware models in parallel for medical image analysis and diagnosis.
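
The idea of allocating layers so that neither engine sits idle can be illustrated with a toy sketch. This is not the HaX-CoNN algorithm or the authors' implementation: the function names and the per-layer latencies below are hypothetical, and the model ignores transition overhead, shared-memory contention, and inter-layer dependencies across engines. It assumes one model starts on the DLA and hands over to the GPU at a single split point, while the other model does the opposite, and it brute-forces the pair of split points that minimizes the load of the busier engine.

```python
from itertools import product


def engine_loads(gpu_ms, dla_ms, split, dla_first):
    """Return (gpu_busy, dla_busy) in ms for one model with one transition at `split`."""
    if dla_first:
        # Layers [0, split) run on the DLA, the rest on the GPU.
        return sum(gpu_ms[split:]), sum(dla_ms[:split])
    # Layers [0, split) run on the GPU, the rest on the DLA.
    return sum(gpu_ms[:split]), sum(dla_ms[split:])


def best_splits(model_a, model_b):
    """Brute-force the split points that minimize the busier engine's total load.

    Each model is a (gpu_ms, dla_ms) pair of per-layer latency lists.
    Returns (busiest_engine_ms, idle_gap_ms, split_a, split_b).
    """
    a_gpu, a_dla = model_a
    b_gpu, b_dla = model_b
    best = None
    for ka, kb in product(range(len(a_gpu) + 1), range(len(b_gpu) + 1)):
        ga, da = engine_loads(a_gpu, a_dla, ka, dla_first=True)
        gb, db = engine_loads(b_gpu, b_dla, kb, dla_first=False)
        gpu_busy, dla_busy = ga + gb, da + db
        cost = max(gpu_busy, dla_busy)
        if best is None or cost < best[0]:
            best = (cost, abs(gpu_busy - dla_busy), ka, kb)
    return best


# Hypothetical per-layer latencies in milliseconds (not measured values).
gan = ([2.0, 2.0, 2.0, 2.0], [3.0, 3.0, 3.0, 3.0])  # (GPU, DLA)
detector = ([1.0, 1.0], [2.0, 2.0])                 # (GPU, DLA)

result = best_splits(gan, detector)
gpu_only_ms = sum(gan[0]) + sum(detector[0])
print("best (busiest ms, idle gap ms, GAN split, detector split):", result)
print("GPU-only baseline ms:", gpu_only_ms)
```

With these made-up numbers, the search can be compared against the GPU-only baseline printed on the last line; a real scheduler would replace the latency lists with profiled per-layer timings for each engine.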

Paper Structure

This paper contains 31 sections, 9 equations, 14 figures, 6 tables.

Figures (14)

  • Figure 1: Proposed schema for the two methodologies
  • Figure 2: Block diagram of the NVIDIA Jetson AGX Orin
  • Figure 3: Difference between models in the timing diagram of the client-server scheme
  • Figure 4: Timing diagram of HaX-CoNN scheduling (case 3) compared to GPU-only (case 1) and GPU-DLA (case 2)
  • Figure 5: Simplified architecture of the GAN Pix2Pix model
  • ...and 9 more figures