Medifact at PerAnsSumm 2025: Leveraging Lightweight Models for Perspective-Specific Summarization of Clinical Q&A Forums
Nadia Saeed
TL;DR
The paper tackles perspective-aware summarization of healthcare Q&A with a lightweight hybrid pipeline. Perspective labeling combines Snorkel-based weak supervision, an SVM classifier over sentence embeddings, and zero-shot BART-MNLI; summarization then runs in two stages, with BART for extraction and Pegasus for abstractive refinement. On the PerAnsSumm 2025 shared task, the MediFact system placed 12th among 100 teams, balancing computational efficiency with contextual accuracy. Key contributions are a modular workflow that reduces reliance on large language models, the combination of Snorkel-based labeling with transformer-based summarization, and an evaluation covering both span identification/classification and summarization metrics. The study points to practical ways of deploying clinically relevant CQA summarization systems with limited resources, and outlines future work on lightweight fine-tuning, quantization, and retrieval-augmented generation.
Abstract
The PerAnsSumm 2025 challenge focuses on perspective-aware healthcare answer summarization (Agarwal et al., 2025). This work proposes a few-shot learning framework using a Snorkel-BART-SVM pipeline for classifying and summarizing open-ended healthcare community question-answering (CQA) content. An SVM model is trained with weak supervision via Snorkel, enhancing zero-shot learning. Extractive classification identifies perspective-relevant sentences, which are then summarized using a pretrained BART-CNN model. The approach achieved 12th place among 100 teams in the shared task, demonstrating computational efficiency and contextual accuracy. By leveraging pretrained summarization models, this work advances medical CQA research and contributes to clinical decision support systems.
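
The weak-supervision step described above can be illustrated with a minimal, standard-library sketch. It is not the author's implementation and does not use Snorkel's API: simple keyword labeling functions vote on each sentence and a majority vote stands in for Snorkel's label model. The perspective names (SUGGESTION, EXPERIENCE, CAUSE), the keyword cues, and all function names are hypothetical, chosen only for illustration. In the pipeline the abstract describes, labels of this kind would train the SVM, and the grouped sentences would be passed to the pretrained summarizer.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion


def _keyword_lf(label, cues):
    """Build a labeling function that votes `label` if any cue appears."""
    def lf(sentence):
        text = sentence.lower()
        return label if any(cue in text for cue in cues) else ABSTAIN
    return lf


# Hypothetical perspectives and cues (assumptions, not taken from the paper).
LABELING_FUNCTIONS = (
    _keyword_lf("SUGGESTION", ("you should", "try ", "i recommend", "consider ")),
    _keyword_lf("EXPERIENCE", ("i had", "in my case", "my doctor", "i was diagnosed")),
    _keyword_lf("CAUSE", ("caused by", "due to", "because of")),
)


def weak_label(sentence):
    """Majority vote over non-abstaining labeling functions; ties abstain."""
    votes = [v for v in (lf(sentence) for lf in LABELING_FUNCTIONS) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting evidence: leave the sentence unlabeled
    return ranked[0][0]


def group_by_perspective(sentences):
    """Collect weakly labeled sentences per perspective, skipping abstentions."""
    grouped = {}
    for sentence in sentences:
        label = weak_label(sentence)
        if label is not ABSTAIN:
            grouped.setdefault(label, []).append(sentence)
    return grouped


if __name__ == "__main__":
    answer = [
        "You should drink more water and rest.",
        "I had the same headaches last year.",
        "Headaches are often caused by dehydration.",
        "Thanks for asking.",
    ]
    for perspective, spans in group_by_perspective(answer).items():
        print(perspective, spans)
```

Sentences on which the labeling functions disagree or all abstain are left unlabeled here; a learned classifier such as the SVM mentioned in the abstract is the kind of component that could then generalize beyond the keyword cues.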
