QuantU-Net: Efficient Wearable Medical Imaging Using Bitwidth as a Trainable Parameter

Christiaan Boerkamp, Akhil John Thomas

TL;DR

QuantU-Net tackles the deployment of U-Net on wearable medical devices by treating layer bitwidth as a trainable parameter via quantization-aware training with Brevitas. It achieves an average bitwidth of about $4.24$ bits and an approximately $8\times$ reduction in model size while maintaining a validation accuracy of $94.25\%$, only $1.89$ percentage points below the floating-point baseline of $96.14\%$. The training objective combines Binary Cross-Entropy, Dice loss, and a bitwidth regularization term weighted by $\lambda = 0.25$, so that segmentation accuracy and model precision are optimized together. This work demonstrates the feasibility of real-time, low-power tumor segmentation on wearable hardware such as FPGAs, bringing practical AI-assisted diagnosis closer to continuous patient monitoring.

Abstract

Medical image segmentation, particularly tumor segmentation, is a critical task in medical imaging, with U-Net being a widely adopted convolutional neural network (CNN) architecture for this purpose. However, U-Net's high computational and memory requirements pose challenges for deployment on resource-constrained devices such as wearable medical systems. This paper addresses these challenges by introducing QuantU-Net, a quantized version of U-Net optimized for efficient deployment on low-power devices like Field-Programmable Gate Arrays (FPGAs). Using Brevitas, a PyTorch library for quantization-aware training, we quantize the U-Net model, reducing its precision to an average of 4.24 bits while maintaining a validation accuracy of 94.25%, only 1.89 percentage points lower than the floating-point baseline. The quantized model achieves an approximately $8\times$ reduction in size, making it suitable for real-time applications in wearable medical devices. We employ a custom loss function that combines Binary Cross-Entropy (BCE) Loss, Dice Loss, and a bitwidth loss function to optimize both segmentation accuracy and the size of the model. Using this custom loss function, we reduce the training effort required to find an optimal combination of bitwidth and accuracy from a hypothetical $6^{23}$ training sessions to a single training session. The model's use of integer arithmetic highlights its potential for deployment on FPGAs and other dedicated AI accelerator hardware. This work advances the field of medical image segmentation by enabling the deployment of deep learning models on resource-constrained devices, paving the way for real-time, low-power diagnostic solutions in wearable healthcare applications.
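
The headline figures in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below rests on two assumptions that the abstract does not state explicitly: that the floating-point baseline stores values in 32 bits, and that $6^{23}$ stands for 6 candidate bitwidths chosen independently for each of 23 quantized layers. The variable names are illustrative only.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.

# Assumption: the floating-point baseline uses 32 bits per value.
FLOAT_BITS = 32
AVG_QUANT_BITS = 4.24  # average bitwidth reported for QuantU-Net

# Ratio of baseline precision to average quantized precision.
compression_ratio = FLOAT_BITS / AVG_QUANT_BITS

# Assumption: 6^23 means 6 candidate bitwidths for each of 23 layers,
# i.e. the number of fixed-bitwidth configurations in an exhaustive search.
CANDIDATE_BITWIDTHS = 6
QUANTIZED_LAYERS = 23
search_space = CANDIDATE_BITWIDTHS ** QUANTIZED_LAYERS

# Accuracy gap in percentage points (baseline minus quantized).
BASELINE_ACCURACY = 96.14
QUANTIZED_ACCURACY = 94.25
accuracy_gap = BASELINE_ACCURACY - QUANTIZED_ACCURACY

print(f"compression ratio: {compression_ratio:.2f}x")
print(f"exhaustive search: {search_space:.2e} training sessions")
print(f"accuracy gap:      {accuracy_gap:.2f} percentage points")
```

Under the 32-bit assumption the ratio comes out at roughly 7.5, which is consistent with the "approximately $8\times$" size reduction, and the exhaustive search space is on the order of $10^{17}$ configurations, which is why learning the bitwidths within a single training run matters.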

Paper Structure

This paper contains 18 sections, 5 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Individual layer bitwidths over epochs. The plot shows how the bitwidths of different layers in the QuantU-Net model evolve during training. Layers such as the bottleneck and upconvolutional layers show dynamic adjustments to maintain accuracy while minimizing bitwidth.
  • Figure 2: Accuracy and average bitwidth over epochs. The left y-axis shows the validation accuracy, while the right y-axis shows the average bitwidth across the model. The plot demonstrates the trade-off between accuracy and bitwidth reduction, with accuracy remaining stable as the average bitwidth converges to around 4.24 bits.
  • Figure 3: Bitwidth loss vs segmentation loss. The plot illustrates the relationship between the bitwidth regularization loss and the segmentation loss over training epochs. The bitwidth loss decreases as the model optimizes for lower precision, while the segmentation loss remains stable, indicating that the model maintains accuracy despite the reduction in bitwidth.
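
The two quantities compared in Figure 3 can be illustrated with a minimal sketch of a combined objective. It is written in plain Python on flat lists so that it runs without dependencies, whereas the paper trains with PyTorch and Brevitas on tensors with learnable bitwidths. Several details are assumptions, not taken from the source: the BCE and Dice terms are simply summed, the bitwidth term is the mean of the per-layer bitwidths, and the function names, `eps`, and `smooth` are hypothetical. Only the three loss components and $\lambda = 0.25$ come from the paper.

```python
import math

LAMBDA_BITWIDTH = 0.25  # regularization weight reported in the paper


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """One minus the soft Dice coefficient; `smooth` guards empty masks."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def bitwidth_loss(layer_bitwidths):
    """Assumed form of the regularizer: mean bitwidth across layers."""
    return sum(layer_bitwidths) / len(layer_bitwidths)


def total_loss(probs, targets, layer_bitwidths, lam=LAMBDA_BITWIDTH):
    """Segmentation loss (BCE + Dice) plus the weighted bitwidth penalty."""
    segmentation = bce_loss(probs, targets) + dice_loss(probs, targets)
    return segmentation + lam * bitwidth_loss(layer_bitwidths)


# Toy example: four pixels and four layers with illustrative bitwidths.
probs = [0.9, 0.2, 0.8, 0.1]
targets = [1, 0, 1, 0]
high_precision = total_loss(probs, targets, [8.0, 8.0, 8.0, 8.0])
low_precision = total_loss(probs, targets, [4.0, 4.0, 4.0, 4.0])
print(high_precision > low_precision)
```

With the predictions held fixed, lowering every layer from 8 to 4 bits reduces the total by $\lambda \cdot 4 = 1.0$, so the penalty pushes bitwidths down until the segmentation terms start to suffer. In a real training loop the bitwidths would be parameters that receive gradients, which this list-based sketch does not model.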