EPE-P: Evidence-based Parameter-efficient Prompting for Multimodal Learning with Missing Modalities
Zhe Chen, Xun Lin, Yawen Cui, Zitong Yu
TL;DR
This work tackles missing modalities in multimodal learning by introducing Evidence-based Parameter-Efficient Prompting (EPE-P), a compact prompting framework that uses a single comprehensive prompt plus modality-specific weight matrices. The approach uses a Block-wise Kronecker-like Multiplication (BK-M) to tailor the prompt to each missing-modality case and injects the resulting prompts into early transformer layers, complemented by an evidential deep learning loss that captures uncertainty and improves decision-making. Key contributions include the BK-M-based prompt design with low-rank factorization, an evidence-based loss with a KL regularizer, and extensive experiments showing improved robustness and efficiency on MM-IMDb and Hateful Memes. The results show that EPE-P reduces parameter redundancy while outperforming prior prompting methods, making it practical for real-world multimodal systems with incomplete data.
Abstract
Missing modalities are a common challenge in real-world multimodal learning scenarios, occurring during both training and testing. Existing methods for managing missing modalities often require the design of separate prompts for each modality or missing case, leading to complex designs and a substantial increase in the number of parameters to be learned. As the number of modalities grows, these methods become increasingly inefficient due to parameter redundancy. To address these issues, we propose Evidence-based Parameter-Efficient Prompting (EPE-P), a novel and parameter-efficient method for pretrained multimodal networks. Our approach introduces a streamlined design that integrates prompting information across different modalities, reducing complexity and mitigating redundant parameters. Furthermore, we propose an Evidence-based Loss function to better handle the uncertainty associated with missing modalities, improving the model's decision-making. Our experiments demonstrate that EPE-P outperforms existing prompting-based methods in terms of both effectiveness and efficiency. The code is released at https://github.com/Boris-Jobs/EPE-P_MLLMs-Robustness.
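To make the idea of an evidence-based loss with a KL regularizer concrete, the following is a minimal sketch of one common formulation from the evidential deep learning literature: non-negative evidence parameterizes a Dirichlet distribution, a type II maximum likelihood term rewards evidence for the true class, and a KL term pulls misleading evidence toward the uniform Dirichlet. Whether EPE-P uses exactly this form is an assumption; the function names, the softplus evidence mapping, and the `kl_weight` parameter are hypothetical and chosen only for illustration.

```python
import math


def softplus(z: float) -> float:
    """Numerically stable softplus, used here to map logits to non-negative evidence."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def digamma(x: float) -> float:
    """Approximate the digamma function psi(x) for x > 0."""
    result = 0.0
    # The recurrence psi(x) = psi(x + 1) - 1/x shifts x into the asymptotic range.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Asymptotic series: ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def kl_to_uniform_dirichlet(alpha):
    """KL divergence from Dir(alpha) to the uniform Dirichlet Dir(1, ..., 1)."""
    k = len(alpha)
    s = sum(alpha)
    log_norm = math.lgamma(s) - math.lgamma(k) - sum(math.lgamma(a) for a in alpha)
    digamma_s = digamma(s)
    return log_norm + sum((a - 1.0) * (digamma(a) - digamma_s) for a in alpha)


def evidential_loss(logits, label, kl_weight=1.0):
    """Return (loss, uncertainty) for one sample with an integer class label.

    Assumed form: type II maximum likelihood term plus a weighted KL regularizer.
    """
    evidence = [softplus(z) for z in logits]
    alpha = [e + 1.0 for e in evidence]  # Dirichlet concentration parameters
    s = sum(alpha)  # Dirichlet strength
    # Type II maximum likelihood term: log S - log alpha_y.
    nll = math.log(s) - math.log(alpha[label])
    # Drop the true-class evidence so that only misleading evidence is penalized.
    alpha_tilde = [1.0 if i == label else a for i, a in enumerate(alpha)]
    kl = kl_to_uniform_dirichlet(alpha_tilde)
    # Uncertainty mass K / S: close to 1 when the model has collected almost no evidence.
    uncertainty = len(alpha) / s
    return nll + kl_weight * kl, uncertainty
```

Under these assumptions, an input whose logits carry little evidence (for example, because a modality is missing) yields an uncertainty close to 1, while confident evidence for the wrong class is penalized by both the likelihood term and the KL term.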
