SkinGPT-R1: Adapter-Only Dual Distillation for Efficient Dermatology Reasoning

Yuhao Shen, Jiahe Qian, Zhangtianyi Chen, Yuanhao He, Juexiao Zhou

TL;DR

SkinGPT-R1 addresses the need for explicit, verifiable reasoning in dermatology-capable vision-language systems by introducing DermCoT, a dermatology-centered chain-of-thought (CoT) corpus, together with the DermEval evaluator and the DermBench benchmark, which align and measure clinician-rated reasoning quality. The architecture uses a frozen Vision-R1 backbone augmented by adapter-only visual distillation and a low-rank language bias, enabling dermatology priors and evidence-first narratives without incurring latency penalties. Empirical results show that SkinGPT-R1 achieves leading performance on DermBench across six clinician-defined dimensions and yields stable zero-shot accuracy gains on three dermatology classification benchmarks, with ablations confirming the complementary value of DermCoT supervision and visual distillation. The work offers a practical, efficient pathway for domain-specific chain-of-thought modeling in dermatology and suggests a transferable framework for other image-driven medical specialties.

Abstract

We present SkinGPT-R1, a dermatology-focused vision-language model that makes diagnostic chain-of-thought reasoning explicit, step-by-step, and verifiable. To support skin-specific reasoning, we build DermCoT, a corpus of standardized dermatologic chain-of-thought narratives that combines 10,000 DermEval-filtered training cases with 3,000 dermatologist-scored certified cases. We define DermEval as a physician-aligned six-dimensional evaluator and DermBench as the corresponding benchmark for dermatologic chain-of-thought quality. On DermBench, across 14 general, reasoning, and medical vision-language models, SkinGPT-R1 achieves an average score of 4.031 out of 5 over the six clinician-defined dimensions, ranks 1st among all systems, and improves the average score over Vision-R1 by about 41%. On three dermatology classification benchmarks, SkinGPT-R1 delivers stable accuracy gains over Vision-R1 and remains competitive among strong vision-language models. Ablation results further show that DermCoT-based chain-of-thought supervision provides substantial improvements over the base model and that adding dermatology-aware visual distillation yields consistent additional gains in both narrative quality and recognition.

Paper Structure

This paper contains 27 sections, 5 equations, 24 figures, 3 tables.

Figures (24)

  • Figure 1: Overview of SkinGPT-R1. Left: Example dialogue illustrating a typical user-model interaction and a representative model response, highlighting conversational flow and output format. Right: Model architecture and workflow featuring a multimodal backbone with lightweight adapters, trained via supervised fine-tuning and teacher-guided distillation; inference uses the integrated vision-language stack.
  • Figure 2: Workflow of DermCoT construction. Raw images are converted into consistent three-layer CoT narratives by generating observation-only captions with a pretrained VLM [comanici2025gemini], drafting label-aware reasoning that concludes with a diagnosis [jaech2024openai], and normalizing the text for coherence [guo2025deepseek].
  • Figure 3: Data evaluation and split. Each image-CoT pair is scored on six dimensions (Accuracy, Safety, Medical Groundedness, Clinical Coverage, Reasoning Coherence, Description Precision). Upper branch: 3,000 cases are reviewed by board-certified dermatologists and constitute the certified test set. Lower branch: DermEval, an evaluator aligned to physician scoring, rates 15,000 candidates and selects 10,000 pairs whose mean score across the six dimensions is at least 4.5/5 to form the training set.
  • Figure 4: DermBench case with the reference diagnosis "Rosacea nose".
  • Figure 5: Generalized pustular psoriasis
  • ...and 19 more figures
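
The selection rule described in the Figure 3 caption (keep image-CoT pairs whose mean score over the six dimensions is at least 4.5/5) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `ScoredPair` type, the field names, and the function names are hypothetical, and the caption does not say whether any further ranking or tie-breaking is applied to reach exactly 10,000 pairs.

```python
from dataclasses import dataclass
from statistics import mean

# The six dimensions named in the Figure 3 caption (identifier spelling is ours).
DIMENSIONS = (
    "accuracy",
    "safety",
    "medical_groundedness",
    "clinical_coverage",
    "reasoning_coherence",
    "description_precision",
)


@dataclass
class ScoredPair:
    """Hypothetical container for one image-CoT pair and its evaluator scores."""
    image_id: str
    scores: dict  # maps each dimension name to a score on a 1-5 scale


def mean_score(pair: ScoredPair) -> float:
    """Average the six per-dimension scores of a pair."""
    return mean(pair.scores[d] for d in DIMENSIONS)


def select_training_pairs(candidates, threshold=4.5):
    """Keep pairs whose mean score is at least `threshold` (4.5/5 in the caption)."""
    return [p for p in candidates if mean_score(p) >= threshold]


if __name__ == "__main__":
    # Toy scores for illustration only; these are not values from the paper.
    strong = ScoredPair("img_a", dict.fromkeys(DIMENSIONS, 5))
    weak = ScoredPair("img_b", dict.fromkeys(DIMENSIONS, 4))
    kept = select_training_pairs([strong, weak])
    print([p.image_id for p in kept])
```

Thresholding on the mean rather than on each dimension separately lets a pair with one slightly lower dimension still pass if the others are high; a per-dimension minimum would be a stricter alternative that the caption does not describe.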