Connector-S: A Survey of Connectors in Multi-modal Large Language Models
Xun Zhu, Zheng Zhang, Xi Chen, Yiming Shi, Miao Li, Ji Wu
TL;DR
This survey examines the connector component in multi-modal large language models (MLLMs), arguing that connectors, which sit between modality encoders and the LLM backbone, are pivotal for performance and adaptability. It introduces a two-level taxonomy: atomic operations (mapping, compression, mixture of experts) and holistic designs (multi-layer, multi-encoder, and multi-modal scenarios), and systematically reviews representative approaches within each category. Key contributions include the structured taxonomy, a synthesis of current connector methods with concrete examples, and a discussion of open research directions: high-resolution input, dynamic compression, guide information selection, combination strategy, and interpretability. The work aims to give researchers a clear roadmap for designing more powerful and efficient connectors that improve the performance and adaptability of MLLMs.
Abstract
With the rapid advancements in multi-modal large language models (MLLMs), connectors play a pivotal role in bridging diverse modalities and enhancing model performance. However, the design and evolution of connectors have not been comprehensively analyzed, leaving gaps in understanding how these components function and hindering the development of more powerful connectors. In this survey, we systematically review the current progress of connectors in MLLMs and present a structured taxonomy that categorizes connectors into atomic operations (mapping, compression, mixture of experts) and holistic designs (multi-layer, multi-encoder, multi-modal scenarios), highlighting their technical contributions and advancements. Furthermore, we discuss several promising research frontiers and challenges, including high-resolution input, dynamic compression, guide information selection, combination strategy, and interpretability. This survey is intended to serve as a foundational reference and a clear roadmap for researchers, providing valuable insights into the design and optimization of next-generation connectors to enhance the performance and adaptability of MLLMs.
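To make the three atomic operations named in the taxonomy concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a method from the survey: the function names (`map_tokens`, `compress_tokens`, `moe_tokens`) are hypothetical, average pooling stands in for compression, a dense softmax gate stands in for mixture of experts, and the fixed weight matrices stand in for parameters that a real connector would learn.

```python
import math

def matvec(weight, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]

def map_tokens(tokens, weight):
    """Mapping: project each encoder token into the LLM embedding space."""
    return [matvec(weight, token) for token in tokens]

def compress_tokens(tokens, window):
    """Compression (assumed here: average pooling over `window` tokens)."""
    pooled = []
    for start in range(0, len(tokens), window):
        group = tokens[start:start + window]
        dim = len(group[0])
        pooled.append([sum(t[d] for t in group) / len(group) for d in range(dim)])
    return pooled

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_tokens(tokens, gate, experts):
    """Mixture of experts (assumed here: dense softmax gating).

    Each token is projected by every expert, and the expert outputs are
    combined with weights produced by the gate for that token.
    """
    mixed = []
    for token in tokens:
        weights = softmax(matvec(gate, token))
        outputs = [matvec(expert, token) for expert in experts]
        dim = len(outputs[0])
        mixed.append([sum(w * out[d] for w, out in zip(weights, outputs))
                      for d in range(dim)])
    return mixed

# Hypothetical toy input: 4 encoder tokens of dimension 2.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
projection = [[1, 0], [0, 1], [1, 1]]           # maps dimension 2 to dimension 3
mapped = map_tokens(tokens, projection)         # 4 tokens of dimension 3
compressed = compress_tokens(mapped, window=2)  # 2 tokens of dimension 3
```

In this sketch the operations compose: mapping changes the feature dimension, compression shortens the token sequence passed to the LLM, and the mixture-of-experts step replaces a single projection with a gated combination of several. Holistic designs in the taxonomy combine such operations across layers, encoders, or modalities.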
