SeLoRA: Self-Expanding Low-Rank Adaptation of Latent Diffusion Model for Medical Image Synthesis

Yuchen Mao; Hongwei Li; Wei Pang; Giorgos Papanastasiou; Guang Yang; Chengjia Wang

TL;DR

SeLoRA introduces a self-expanding low-rank adaptation for latent diffusion models aimed at medical image synthesis. By initializing with a minimal rank and using Fisher-information-based criteria to decide when to grow per-layer ranks, SeLoRA allocates capacity where it most improves image quality while maintaining a small parameter footprint. Empirical results on IU X-RAY and Montgomery County CXR show SeLoRA outperforms LoRA, DyLoRA, and AdaLoRA in both fidelity (FID) and text–image alignment (CLIP), with interpretable rank allocation that emphasizes cross-modal interaction regions. The work highlights FI-guided, dynamic rank expansion as a practical strategy for efficient fine-tuning of diffusion models in data-scarce medical settings and points to extensions to 3D and broader clinical applications.

Abstract

Medical image synthesis remains a persistent challenge: annotated data are scarce, and multi-modal analysis often requires synthesizing 'missing modalities', which underscores the need for effective synthesis methods. Recently, combining Low-Rank Adaptation (LoRA) with latent diffusion models (LDMs) has emerged as a viable approach for efficiently adapting large pre-trained models to the medical field. However, applying LoRA directly assumes a uniform rank across all linear layers, which overlooks the differing significance of individual weight matrices and leads to sub-optimal outcomes. Prior work on LoRA prioritizes reducing the number of trainable parameters, leaving an opportunity to tailor the adaptation process to the intricate demands of medical image synthesis. In response, we present SeLoRA, a Self-Expanding Low-Rank Adaptation module that dynamically expands its rank across layers during training, strategically placing additional rank on crucial layers so that the model can improve synthesis quality where it matters most. The proposed method not only enables LDMs to be fine-tuned on medical data efficiently but also allows the model to achieve improved image quality with minimal rank. The code for SeLoRA is publicly available at https://anonymous.4open.science/r/SeLoRA-980D.
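To make the rank trade-off in the abstract concrete, the sketch below counts trainable parameters for a LoRA pair, where a rank-r adapter on a d_out x d_in weight adds r * (d_in + d_out) parameters. The layer shapes and the non-uniform rank allocation are hypothetical values chosen for illustration, not figures from the paper.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters of one LoRA pair: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_in + d_out)


# Hypothetical (d_in, d_out) shapes for three linear layers; not taken from the paper.
layers = [(320, 320), (768, 320), (1280, 1280)]

# Plain LoRA: the same rank on every layer.
uniform = sum(lora_params(d_in, d_out, 4) for d_in, d_out in layers)

# Hypothetical non-uniform allocation that puts more rank on the second layer.
ranks = [1, 8, 2]
adaptive = sum(
    lora_params(d_in, d_out, r) for (d_in, d_out), r in zip(layers, ranks)
)

# Parameters touched by full fine-tuning of the same layers, for comparison.
full = sum(d_in * d_out for d_in, d_out in layers)

print(uniform, adaptive, full)
```

Under these assumed shapes, the non-uniform allocation gives one layer twice the rank of the uniform setting while using fewer trainable parameters overall, and both stay far below the full fine-tuning count.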

Paper Structure (12 sections, 5 equations, 3 figures, 1 table, 1 algorithm)

Introduction
Related Work on LoRA
SeLoRA
How to expand?
When to expand?
Datasets for Evaluation
Implementation Details
Evaluation Metric
Results
Rank Allocation Analysis
The Impact of $\lambda$
Conclusion

Figures (3)

Figure 1: Training illustration of a single SeLoRA module. During training, SeLoRA behaves like a basic LoRA. Every $t$ steps, however, it is tested for rank expansion, and its rank is progressively expanded whenever the FI-Ratio exceeds the desired threshold.
Figure 2: Qualitative comparison on the IU X-RAY dataset, with images generated by Stable Diffusion models fine-tuned with various LoRA variants. Prompt used: "Heart size and vascularity normal. These contour normal. Lungs clear. No pleural effusions or pneumothoraces." More sample results are presented in the Appendix.
Figure 3: Final ranks of SeLoRA after fine-tuning Stable Diffusion on the IU X-RAY dataset. The x-axis shows the layer index, and the y-axis shows the corresponding attention weight matrix. SeLoRA places more rank on crucial layers.
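The expansion test described in the Figure 1 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the FI-Ratio is assumed to be the ratio of an empirical Fisher-information score (sum of squared gradients) for a candidate expanded adapter to that of the current adapter, and the zero-initialized new column of B is an assumed way to keep the adapter output unchanged at the moment of expansion. All function names and gradient values are hypothetical.

```python
import random


def fi_score(grads):
    """Assumed Fisher-information proxy: sum of squared gradients."""
    return sum(g * g for g in grads)


def should_expand(grads_current, grads_expanded, threshold):
    """Return True when the assumed FI-Ratio exceeds the threshold."""
    current = fi_score(grads_current)
    if current == 0.0:
        return False  # nothing to compare against; skip expansion
    return fi_score(grads_expanded) / current > threshold


def expand_rank(A, B, rng):
    """Grow a LoRA pair (B: d_out x r, A: r x d_in) by one rank.

    The new row of A is small random noise and the new column of B is zero,
    so the product B @ A is unchanged (an assumed initialization).
    """
    d_in = len(A[0])
    A_new = A + [[rng.gauss(0.0, 0.01) for _ in range(d_in)]]
    B_new = [row + [0.0] for row in B]
    return A_new, B_new


def matmul(X, Y):
    """Plain-Python matrix product, used only to check the update is unchanged."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


rng = random.Random(0)
A = [[0.1, -0.2, 0.3]]  # rank 1, d_in = 3
B = [[0.5], [-0.4]]     # d_out = 2
before = matmul(B, A)

# Toy gradient values standing in for the current and candidate adapters.
if should_expand([0.1, 0.2], [0.3, 0.4, 0.1], threshold=1.1):
    A, B = expand_rank(A, B, rng)

after = matmul(B, A)
print(len(A), after == before)
```

In a training loop, this check would run once every $t$ steps per adapted layer, so layers whose candidate expansion repeatedly passes the test accumulate more rank than the others.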
