KRAL: Knowledge and Reasoning Augmented Learning for LLM-assisted Clinical Antimicrobial Therapy
Zhe Li, Yehan Qiu, Yujie Chen, Xiang Zhou
TL;DR
KRAL addresses key barriers to deploying LLMs in clinical antimicrobial therapy by combining knowledge distillation and reasoning augmentation in a privacy-preserving, hardware-efficient framework. Its three stages (data distillation, agentic reinforcement learning with GRPO, and multi-expert hierarchical evaluation) produce a 32B student model that outperforms SFT and RAG baselines on external knowledge and reasoning benchmarks while substantially reducing training time and memory requirements. Empirically, KRAL improves Accuracy@1 by $+1.8\%$ (vs SFT) and $+3.6\%$ (vs RAG), and Pass@1 by $+27\%$ (vs SFT) and $+27.2\%$ (vs RAG), at roughly $20\%$ of SFT's cost and with up to $100\times$ lower VRAM usage, which makes on-premise deployment feasible. These results support practical, safe, and scalable AI-assisted antimicrobial decision support in resource-constrained settings, with potential applicability to broader clinical domains.
Abstract
Clinical antimicrobial therapy requires the dynamic integration of pathogen profiles, host factors, pharmacological properties of antimicrobials, and the severity of infection. This complexity imposes fundamental limitations on the applicability of Large Language Models (LLMs) in high-stakes clinical decision-making, including knowledge gaps, data privacy concerns, high deployment costs, and limited reasoning capabilities. To address these challenges, we propose KRAL (Knowledge and Reasoning Augmented Learning), a low-cost, scalable, privacy-preserving paradigm. KRAL leverages teacher-model reasoning to automatically distill knowledge and reasoning trajectories via answer-to-question reverse generation. It employs heuristic learning for semi-supervised data augmentation, reducing manual annotation requirements by approximately 80%. It also uses agentic reinforcement learning to jointly enhance medical knowledge and reasoning while optimizing computational and memory efficiency. A hierarchical evaluation that employs diverse teacher-model proxies reduces assessment costs, and a modular interface design facilitates seamless system updates. Experimental results demonstrate that KRAL significantly outperforms traditional Retrieval-Augmented Generation (RAG) and Supervised Fine-Tuning (SFT) methods. It improves knowledge question-answering capability (Accuracy@1 on the external open-source benchmark MEDQA increased by 1.8% vs. SFT and 3.6% vs. RAG) and reasoning capability (Pass@1 on the external benchmark PUMCH Antimicrobial increased by 27% vs. SFT and 27.2% vs. RAG), while incurring about 20% of SFT's long-term training costs. This establishes KRAL as an effective solution for enhancing local LLMs' clinical diagnostic capabilities, enabling low-cost, high-safety deployment in complex medical decision support.
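The two headline metrics can be illustrated with a minimal sketch. It assumes the common definitions: Accuracy@1 as the fraction of questions whose single top-ranked answer matches the reference, and Pass@1 as the per-case share of sampled responses judged acceptable, averaged over cases. The abstract does not spell out the exact scoring protocol, so the function names, the case-insensitive string matching, and the boolean pass judgments below are illustrative assumptions rather than the authors' evaluation code.

```python
from typing import Sequence


def accuracy_at_1(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of questions whose top-1 prediction equals the gold answer.

    Assumption: answers are short labels (e.g. option letters) compared
    case-insensitively after stripping whitespace.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("evaluation set is empty")
    hits = sum(
        p.strip().upper() == g.strip().upper() for p, g in zip(predictions, gold)
    )
    return hits / len(gold)


def pass_at_1(judgments: Sequence[Sequence[bool]]) -> float:
    """Mean over cases of c / n, where n responses were sampled for a case
    and c of them were judged acceptable.

    Assumption: each inner sequence holds hypothetical pass/fail judgments
    (e.g. from an expert or a proxy evaluator) for one clinical case.
    """
    if not judgments:
        raise ValueError("evaluation set is empty")
    per_case = []
    for samples in judgments:
        if not samples:
            raise ValueError("each case needs at least one sampled response")
        per_case.append(sum(samples) / len(samples))
    return sum(per_case) / len(per_case)


if __name__ == "__main__":
    # Toy data, not taken from the paper.
    print(accuracy_at_1(["A", "b", "C", "D"], ["A", "B", "D", "D"]))
    print(pass_at_1([[True, False], [True, True], [False, False]]))
```

With one sampled response per case, `pass_at_1` reduces to the plain fraction of cases that pass, which is how Pass@1 is most often reported.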
