AI for Regulatory Affairs: Balancing Accuracy, Interpretability, and Computational Cost in Medical Device Classification
Yu Han, Aaron Ceross, Jeroen H. M. Bergmann
TL;DR
The paper tackles the challenge of deploying AI for medical device regulatory classification by benchmarking rule-based, traditional ML, deep learning, and LLM approaches on Chinese device descriptions from the NMPA. It assesses models along three axes—classification accuracy, interpretability, and computational cost—and reports that CNNs and XGBoost offer the strongest balance, while LLMs provide interpretability at the expense of accuracy and speed. The study also analyzes interpretability mechanisms, highlights per-class performance differences, and argues for hybrid workflows that combine fast, transparent models with high-accuracy ones, augmented by expert review and scenario-based testing. The result is a practical, regulator-focused framework for selecting and validating AI tools in regulatory affairs, with proposed guidelines for explanation utility, risk-based testing, and governance to enable trustworthy adoption in real-world regulatory pipelines.
Abstract
Regulatory affairs, which sits at the intersection of medicine and law, can benefit significantly from AI-enabled automation. Classification is the initial step in which manufacturers position their products with regulatory authorities, and it plays a critical role in determining market access, regulatory scrutiny, and ultimately patient safety. In this study, we investigate a broad range of AI models, including traditional machine learning (ML) algorithms, deep learning architectures, and large language models, using a regulatory dataset of medical device descriptions. We evaluate each model along three key dimensions: accuracy, interpretability, and computational cost.
