KidneyTalk-open: No-code Deployment of a Private Large Language Model with Medical Documentation-Enhanced Knowledge Database for Kidney Disease

Yongchao Long, Chao Yang, Gongzheng Tang, Jinwei Wang, Zhun Sui, Yuxi Zhou, Shenda Hong, Luxia Zhang

TL;DR

KidneyTalk-open addresses privacy-sensitive medical AI by delivering a no-code desktop solution that integrates local LLM inference with a medical knowledge database and a multi-agent retrieval-augmentation pipeline. The system combines embedded semantic search (BGE-M3) with domain-specific reasoning (DeepSeek-r1) and generation (Qwen2.5) to ground answers in medical documents, while AddRep enhances recall and reduces hallucinations through query refinement, divergent thinking, and knowledge reasoning. Validation on 1,455 CNME-MCQ nephrology questions shows AddRep achieving 29.1% accuracy with a 4.9% rejection rate, outperforming baseline approaches, and comparative case studies demonstrate better adherence to CKD guidelines and personalized management. Overall, KidneyTalk-open offers a practical, privacy-preserving framework for clinical AI at the desktop, enabling traceable, evidence-based Q&A and setting a framework for future multi-modal and broader-domain medical AI systems.

Abstract

Privacy-preserving medical decision support for kidney disease requires localized deployment of large language models (LLMs) while maintaining clinical reasoning capabilities. Current solutions face three challenges: 1) Cloud-based LLMs pose data security risks; 2) Local model deployment demands technical expertise; 3) General LLMs lack mechanisms to integrate medical knowledge. Retrieval-augmented systems also struggle with medical document processing and clinical usability. We developed KidneyTalk-open, a desktop system integrating three technical components: 1) No-code deployment of state-of-the-art (SOTA) open-source LLMs (such as DeepSeek-r1 and Qwen2.5) via a local inference engine; 2) A medical document processing pipeline combining context-aware chunking and intelligent filtering; 3) An Adaptive Retrieval and Augmentation Pipeline (AddRep) that uses multi-agent collaboration to improve the recall rate of medical documents. A graphical interface enables clinicians to manage medical documents and conduct AI-powered consultations without technical expertise. Experimental validation on 1,455 challenging nephrology exam questions demonstrates AddRep's effectiveness: it achieves 29.1% accuracy (+8.1% over baseline) with intelligent knowledge integration, while maintaining robustness through a 4.9% rejection rate that suppresses hallucinations. Comparative case studies with mainstream products (AnythingLLM, Chatbox, GPT4ALL) demonstrate KidneyTalk-open's superior performance on real clinical queries. KidneyTalk-open represents the first no-code medical LLM system enabling secure documentation-enhanced medical Q&A on the desktop. Its design establishes a new framework for privacy-sensitive clinical AI applications. The system significantly lowers technical barriers while improving evidence traceability, enabling more medical staff or patients to use SOTA open-source LLMs conveniently.

Paper Structure

This paper contains 24 sections, 10 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Schematic Diagram of the Core Design of KidneyTalk-open. Right: The knowledge database (KB) construction process. It begins by parsing PDF, Word, or Markdown documents into plain text, then chunks the documents into knowledge snippets, filters the snippets, computes semantic embeddings for them, and finally stores the embedded vectors in a vector database. Middle: A schematic of the Hierarchical Navigable Small World (HNSW) vector database, a structure for efficient vector storage and retrieval that manages the knowledge vectors in the KB. Left: The Adaptive Retrieval and Augmentation Pipeline (AddRep), the method proposed in this paper for enhancing medical document recall. It uses the collaboration of multiple agents (program modules with specific functions) to expand the user's question space. When a user submits a query, these agents perform operations such as query refinement and divergent thinking to gather relevant knowledge snippets comprehensively, which improves the accuracy and completeness of the final answer.
  • Figure 2: Overview of the KB construction page. Entrance: This page is accessed by clicking the book icon. Filter Agent Setting: Although the construction is fully automated, users can customize the prompt of the Filter in the agent settings panel. This prompt controls which knowledge snippets in the documents are embedded into the KB. Document List: Users upload documents by clicking the Upload button to build the KB. The documents already in the KB are listed in this area. A document in the "on" state will be retrieved to enhance the LLM. Knowledge Snippet Retrieval: A convenient feature that lets users quickly retrieve knowledge snippets from the KB. For more details about this feature, see Figure 3.
  • Figure 3: Usage Example of the Knowledge Snippet Retrieval. Query: We use the query "Treatment plans for patients with type 2 diabetes mellitus combined with stage 4 CKD" as an example to demonstrate this function. TopK: The KB is expected to return the specified number of knowledge snippets that are most semantically similar to the query. Candidate Knowledge Snippets: We refer to the knowledge snippets returned by the retrieval as candidates. For readability, we display these candidates as floating cards. Each candidate consists of content, semantic similarity, and source document. Knowledge Snippet Content: We use green underlines to mark the sentences related to the query, including descriptions of various treatment drugs for DKD and of lifestyle and dietary patterns. Semantic Similarity: Distance indicates how semantically close the query and the content are. It is calculated from cosine similarity; the smaller the value, the more similar they are. Source Document: The path of the document from which the candidate is sourced.
  • Figure 4: Usage Example of the Chat Module. Entrance: This page is accessed by clicking the chat icon. Query: We use the query "What are the treatment plans for patients with type 2 diabetes and CKD4?" as an example to demonstrate this function. Answer: The final answer of the model Qwen2.5:7B to the query when knowledge database retrieval enhancement is enabled (via Docs-Enhanced). Source Docs: KidneyTalk-open has successfully retrieved some helpful snippets from the knowledge database. Divergent Thinking Query & Helpful Reason: Users can view not only the content of the snippets but also the intermediate results of KidneyTalk-open, including the queries from the Divergent Thinking Agent and the generated helpful reasons. The technical details are described in the section on AddRep.
  • Figure 5: Example Demonstrating the Reasoning Model DeepSeek-R1:7b. We use the same query as in Figure 4 to validate the reasoning model DeepSeek-R1 on KidneyTalk-open. The advantage of the reasoning model is shown by comparing it with the general model Qwen2.5:7b (as shown in Figure 4). Reasoning: The content wrapped in the <think></think> tags is the reasoning process of DeepSeek-R1:7b based on the snippets retrieved by KidneyTalk-open. Answer: Compared with the answer given by Qwen2.5:7B, the answer provided by DeepSeek-R1:7b after reasoning makes greater use of the retrieved snippets. We recommend using the reasoning model as the base model, which may lead to more reliable answers. Source Docs: For convenience, we reformatted the 6 retrieved knowledge snippets and displayed them on the right of the picture, highlighting the text related to the query.
  • ...and 2 more figures
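
The distance-ranked TopK retrieval described in the Figure 3 caption can be illustrated with a minimal sketch. It assumes that the displayed distance is 1 minus the cosine similarity (the caption only states that it is calculated from cosine similarity) and it ranks by brute force rather than through the HNSW index the system uses. The function names, the toy 3-dimensional vectors, and the file names are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity (assumed definition of 'distance')."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    snippets: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[float, str]]:
    """Return the k (distance, source) pairs closest to the query."""
    scored = [(cosine_distance(query_vec, vec), source) for source, vec in snippets]
    scored.sort(key=lambda pair: pair[0])  # smaller distance = more similar
    return scored[:k]


# Toy embeddings standing in for real snippet vectors (hypothetical data).
snippets = [
    ("guideline_a.pdf", [1.0, 0.0, 0.0]),
    ("guideline_b.pdf", [0.6, 0.8, 0.0]),
    ("unrelated.md", [0.0, 0.0, 1.0]),
]
query = [1.0, 0.0, 0.0]
results = top_k(query, snippets, k=2)

for distance, source in results:
    print(f"{distance:.3f}  {source}")
```

Each returned pair mirrors a candidate card in the figure: a distance and a source document. An approximate index such as HNSW returns the same kind of ranked list without scanning every vector.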