Low-Rank Knowledge Decomposition for Medical Foundation Models
Yuhang Zhou, Haolin Li, Siyuan Du, Jiangchao Yao, Ya Zhang, Yanfeng Wang
TL;DR
The paper tackles the tension between generality and specialization in medical foundation models, along with the associated deployment costs. It introduces Low-Rank Knowledge Decomposition (LoRKD), which decomposes a medical foundation model F_p into a shared backbone F_s and T task-specific experts. Each expert is built from low-rank factors, g_t = (W_0 + B_t A_t) h_t, and the experts are trained with an efficient gradient separation mechanism. A task-knowledge switch and parameter fusion allow multiple lightweight experts to be deployed while the backbone size stays fixed, and a KL-based transfer loss guides knowledge transfer from the foundation model. Across RadImageNet, MedMNIST, Med-MT, and seven downstream datasets, LoRKD achieves better performance and transferability than prior work (KF) with substantially fewer parameters and without requiring dual networks, illustrating a practical path to scalable, specialized medical foundation models.
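The low-rank expert formula g_t = (W_0 + B_t A_t) h_t and the parameter fusion step can be illustrated with a minimal NumPy sketch. It assumes a single linear layer rather than the convolutional layers the paper targets, and the sizes, random initialisation, and the function names `expert_forward` and `fuse_expert` are hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumptions, not values from the paper).
d_in, d_out, rank, num_tasks = 16, 8, 2, 3

# Shared weight W_0 plus per-task low-rank factors:
# B_t has shape (d_out, rank) and A_t has shape (rank, d_in).
W0 = rng.normal(size=(d_out, d_in))
B = rng.normal(size=(num_tasks, d_out, rank)) * 0.1
A = rng.normal(size=(num_tasks, rank, d_in)) * 0.1


def expert_forward(h, t):
    """Compute g_t = (W_0 + B_t A_t) h_t without forming the full d_out x d_in update."""
    return W0 @ h + B[t] @ (A[t] @ h)


def fuse_expert(t):
    """Parameter fusion: fold B_t A_t into W_0 so expert t is one matrix of backbone size."""
    return W0 + B[t] @ A[t]


h = rng.normal(size=d_in)
t = 1  # hypothetical task-knowledge switch: index of the active expert

g_unfused = expert_forward(h, t)
g_fused = fuse_expert(t) @ h

# Extra parameters each task adds on top of the shared weight.
extra_params_per_task = rank * (d_in + d_out)
backbone_params = d_in * d_out
print(extra_params_per_task, backbone_params)
```

Under these assumptions the fused and unfused paths give the same output, so a deployed expert needs only one weight matrix of the original shape, while each additional task costs rank * (d_in + d_out) parameters during decomposition.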
Abstract
The popularity of large-scale pre-training has promoted the development of medical foundation models. However, some studies have shown that although foundation models exhibit strong general feature extraction capabilities, their performance on specific tasks is still inferior to task-specific methods. In this paper, we explore a new perspective called "Knowledge Decomposition" to improve performance on specific medical tasks. It deconstructs the foundation model into multiple lightweight expert models, each dedicated to a particular task, with the goal of improving specialization while reducing resource expenditure. To this end, we design a novel framework named Low-Rank Knowledge Decomposition (LoRKD), which explicitly separates gradients by incorporating low-rank expert modules and an efficient knowledge separation convolution. Extensive experimental results demonstrate that the decomposed models achieve strong task performance and transferability, even surpassing the original foundation models.
