A Concept-Based Explainability Framework for Large Multimodal Models
Jayneel Parekh, Pegah Khayatan, Mustafa Shukor, Alasdair Newson, Matthieu Cord
TL;DR
This work tackles the interpretability of large multimodal models (LMMs) by introducing CoX-LMM, a dictionary-learning framework that extracts multimodal concepts tied to a target token. The method builds a token-centered representation matrix from LMM internals and applies a Semi-NMF decomposition, yielding a dictionary of concept vectors that can be grounded in both vision (via visual samples) and text (via the unembedding of the LLM). The approach is validated on COCO-based captioning models (e.g., DePALM) and corroborated with LLaVA experiments, showing meaningful multimodal grounding, balanced disentanglement, and useful local interpretations for test samples. The results indicate that deeper Transformer layers better reveal multimodal structure, giving more transparent insight into how LMMs represent and process multimodal information, with potential to improve trust and debugging in practical deployments.
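To make the decomposition step concrete, here is a minimal NumPy sketch of a plain Semi-NMF factorization, using the standard multiplicative updates of Ding, Li and Jordan. It is an illustration under stated assumptions, not the authors' implementation: the function name `semi_nmf`, the matrix layout (feature dimension by number of samples), the random initialization, and the synthetic data are all hypothetical, and any regularization used in the paper is omitted.

```python
import numpy as np

def semi_nmf(Z, n_concepts, n_iter=200, eps=1e-9, seed=0):
    """Factor Z (dim x n_samples) as U @ V with V >= 0 and U unconstrained.

    Hypothetical helper: plain Semi-NMF with multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    n_samples = Z.shape[1]
    # G is the transpose of V; it starts strictly positive and stays so.
    G = rng.random((n_samples, n_concepts)) + 0.1
    for _ in range(n_iter):
        # Least-squares update of the dictionary for fixed activations.
        F = Z @ G @ np.linalg.pinv(G.T @ G)
        A = Z.T @ F  # (n_samples, n_concepts)
        B = F.T @ F  # (n_concepts, n_concepts)
        # Split each matrix into its positive and negative parts.
        A_pos, A_neg = (np.abs(A) + A) / 2, (np.abs(A) - A) / 2
        B_pos, B_neg = (np.abs(B) + B) / 2, (np.abs(B) - B) / 2
        # Multiplicative update keeps the activations non-negative.
        G *= np.sqrt((A_pos + G @ B_neg + eps) / (A_neg + G @ B_pos + eps))
    # Final dictionary consistent with the last activations.
    U = Z @ G @ np.linalg.pinv(G.T @ G)
    return U, G.T

# Synthetic stand-in for a token-centered representation matrix (assumption).
rng = np.random.default_rng(1)
Z = rng.normal(size=(16, 4)) @ rng.random((4, 50))
U, V = semi_nmf(Z, n_concepts=4)
rel_err = np.linalg.norm(Z - U @ V) / np.linalg.norm(Z)
print(U.shape, V.shape, float(V.min()) >= 0.0, rel_err)
```

Each column of `U` plays the role of a concept vector, and each column of `V` holds the non-negative activations of one sample over the concepts. The non-negativity constraint applies only to `V`, which is what distinguishes Semi-NMF from standard NMF and lets it handle representations with mixed signs.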
Abstract
Large multimodal models (LMMs) combine unimodal encoders and large language models (LLMs) to perform multimodal tasks. Despite recent progress on the interpretability of these models, the internal representations of LMMs remain poorly understood. In this paper, we present a novel framework for the interpretation of LMMs. We propose a dictionary-learning-based approach, applied to the representation of tokens. The elements of the learned dictionary correspond to our proposed concepts. We show that these concepts are semantically well grounded in both vision and text, and we therefore refer to them as "multi-modal concepts". We evaluate the learned concepts both qualitatively and quantitatively. We show that the extracted multimodal concepts are useful for interpreting the representations of test samples. Finally, we evaluate the disentanglement between different concepts and the quality of grounding concepts visually and textually. Our code is publicly available at https://github.com/mshukor/xl-vlms
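The grounding of a concept in text and vision can be sketched as follows, under assumptions. Textual grounding is taken here to mean projecting a concept vector through the LLM's unembedding matrix and reading off the highest-scoring tokens; visual grounding is assumed to mean selecting the samples on which the concept activates most strongly. The function `ground_concept`, the toy vocabulary, and the identity unembedding matrix are hypothetical, and any normalization layer a real model applies before its unembedding is ignored.

```python
import numpy as np

def ground_concept(k, U, V, W_unembed, vocab, top_tokens=5, top_samples=3):
    """Return (top words, indices of most-activating samples) for concept k.

    Hypothetical helper. Shapes: U is (dim, n_concepts), V is
    (n_concepts, n_samples), W_unembed is (vocab_size, dim).
    """
    # Text grounding: score every vocabulary token against the concept vector.
    logits = W_unembed @ U[:, k]
    token_ids = np.argsort(logits)[::-1][:top_tokens]
    # Visual grounding: samples ranked by the concept's activation.
    sample_ids = np.argsort(V[k])[::-1][:top_samples]
    return [vocab[i] for i in token_ids], sample_ids.tolist()

# Toy inputs, purely illustrative.
vocab = ["dog", "cat", "car", "tree"]
W_unembed = np.eye(4)                      # stand-in unembedding matrix
U = np.array([[0.1], [0.9], [0.0], [0.2]])  # one concept vector
V = np.array([[0.0, 2.0, 0.5]])            # its activations on three samples
words, samples = ground_concept(0, U, V, W_unembed, vocab,
                                top_tokens=2, top_samples=2)
print(words, samples)  # ['cat', 'tree'] [1, 2]
```

In practice the returned sample indices would be mapped back to the corresponding images, so that each concept can be inspected as a set of words alongside a set of pictures.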
