OpenVNA: A Framework for Analyzing the Behavior of Multimodal Language Understanding System under Noisy Scenarios
Ziqi Yuan, Baozheng Zhang, Hua Xu, Zhiyun Liang, Kai Gao
TL;DR
OpenVNA addresses the need for robust evaluation of multimodal language understanding under real-world noisy conditions by delivering a modular open-source framework that unifies noise injection, global robustness benchmarking, and local analysis. It combines a Noise Injection Toolkit with RealNoiseConfig and FFmpeg-based processing, a Global Robustness Benchmark spanning MOSI, MOSEI, CH-SIMS v2, and MintRec, and a GUI for instance-level diagnostics, enabling consistent cross-model comparisons. Robustness is quantified via Arbitrary Interval Robustness (AIR), defined as $\gamma_{abs}(f)=\int_{\sigma_{min}}^{\sigma_{max}} acc_{\sigma}(f)\, d\sigma$, across multiple perturbation types and modalities, with optional noise-based data augmentation to enhance performance. By providing open access to code, datasets, baselines, and an intuitive local-analysis interface, OpenVNA aims to standardize robustness assessments and accelerate development of dependable multimodal systems in noisy settings.
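As an illustration of the AIR definition above, the integral can be approximated numerically once accuracy has been measured at a handful of noise intensities. The following is a minimal sketch, not OpenVNA's own implementation: the function name `air_score`, the choice of the trapezoidal rule, and the sample accuracy values are all assumptions made for this example.

```python
from typing import Sequence


def air_score(sigmas: Sequence[float], accuracies: Sequence[float]) -> float:
    """Approximate the integral of acc_sigma(f) over [sigma_min, sigma_max].

    Hypothetical helper: uses the trapezoidal rule over measured
    (noise intensity, accuracy) pairs. `sigmas` must be strictly increasing.
    """
    if len(sigmas) != len(accuracies):
        raise ValueError("sigmas and accuracies must have the same length")
    if len(sigmas) < 2:
        raise ValueError("at least two noise intensities are required")

    total = 0.0
    for i in range(1, len(sigmas)):
        width = sigmas[i] - sigmas[i - 1]
        if width <= 0:
            raise ValueError("sigmas must be strictly increasing")
        # Area of the trapezoid between two adjacent noise intensities.
        total += 0.5 * width * (accuracies[i] + accuracies[i - 1])
    return total


# Made-up accuracies for illustration only (not results from the paper).
sigmas = [0.0, 0.25, 0.5, 0.75, 1.0]
accs = [0.80, 0.76, 0.70, 0.62, 0.50]
score = air_score(sigmas, accs)
print(f"AIR over [{sigmas[0]}, {sigmas[-1]}]: {score:.4f}")
```

Under this sketch, a model whose accuracy degrades more slowly as noise intensity grows obtains a larger area under the curve, which is what makes the score usable for cross-model comparison over the same interval.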
Abstract
We present OpenVNA, an open-source framework designed for analyzing the behavior of multimodal language understanding systems under noisy conditions. OpenVNA serves as an intuitive toolkit tailored for researchers, facilitating convenient batch-level robustness evaluation and on-the-fly instance-level demonstration. It primarily features a benchmark Python library for assessing global model robustness, offering high flexibility and extensibility and thereby enabling customization with user-defined noise types and models. Additionally, a GUI-based interface has been developed to intuitively analyze local model behavior. In this paper, we describe the design principles and usage of the library and the GUI-based web platform. Currently, OpenVNA is publicly accessible at \url{https://github.com/thuiar/OpenVNA}, with a demonstration video available at \url{https://youtu.be/0Z9cW7RGct4}.
