Differences That Matter: Auditing Models for Capability Gap Discovery and Rectification
Qihao Liu, Chengzhi Mao, Yaojie Liu, Alan Yuille, Wen-Sheng Chu
TL;DR
AuditDM introduces an RL-trained MLLM auditor that actively discovers capability gaps by generating failure-inducing question–image pairs that maximize disagreement among target models. The framework yields annotation-free, targeted data for rectification and consistently improves Gemma-3 and PaliGemma-2 across 16 benchmarks, in one case enabling a 3B model to surpass its 28B counterpart. By focusing on interpretable failure modes and a closed-loop improvement cycle, AuditDM addresses the diminishing returns of data scaling and offers a scalable path to continual MLLM improvement. The work highlights the practical value of model auditing as a diagnostic and corrective tool for multimodal AI systems.
Abstract
Conventional evaluation methods for multimodal LLMs (MLLMs) lack interpretability and are often insufficient to fully disclose significant capability gaps across models. To address this, we introduce AuditDM, an automated framework that actively discovers and rectifies MLLM failure modes by auditing their divergence. AuditDM fine-tunes an MLLM as an auditor via reinforcement learning to generate challenging questions and counterfactual images that maximize disagreement among target models. Once trained, the auditor uncovers diverse, interpretable exemplars that reveal model weaknesses and serve as annotation-free data for rectification. When applied to SoTA models like Gemma-3 and PaliGemma-2, AuditDM discovers more than 20 distinct failure types. Fine-tuning on these discoveries consistently improves all models across 16 benchmarks, and enables a 3B model to surpass its 28B counterpart. Our results suggest that as data scaling hits diminishing returns, targeted model auditing offers an effective path to model diagnosis and improvement.
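To make the idea of "maximizing disagreement among target models" concrete, the following is a minimal sketch of one possible disagreement signal. It is an illustrative assumption, not the reward used by AuditDM: the function names `normalize` and `disagreement_reward` are hypothetical, and the score is simply the fraction of target-model pairs whose answers to the same auditor-generated question differ.

```python
from itertools import combinations
from typing import Sequence


def normalize(answer: str) -> str:
    """Strip whitespace and trailing punctuation, then lowercase,
    so trivially different strings count as the same answer."""
    return answer.strip().strip(".!?").lower()


def disagreement_reward(answers: Sequence[str]) -> float:
    """Hypothetical reward: fraction of model pairs whose normalized
    answers differ (0.0 = full agreement, 1.0 = every pair disagrees)."""
    normalized = [normalize(a) for a in answers]
    pairs = list(combinations(normalized, 2))
    if not pairs:
        # Fewer than two models: disagreement is undefined, treat as zero.
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Answers from three hypothetical target models to one generated question.
    print(disagreement_reward(["cat", "dog", "cat"]))
```

Under this assumed scoring, an auditor trained with reinforcement learning would be rewarded more for question–image pairs on which the target models give different answers. A practical system would likely need a more robust answer comparison than exact string matching.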
