MulChain: Enabling Advanced Cross-Modal Queries in Hybrid-Storage Blockchains
Zhiyuan Peng, Xin Yin, Gang Wang, Chenhao Ying, Wei Chen, Xikun Jiang, Yibin Xu, Yuan Luo
TL;DR
MulChain introduces a pluggable middleware that enables advanced cross-modal queries on hybrid-storage blockchains without altering existing blockchain cores. It combines two verifiable indexing structures, a gas-efficient BHashTree for time-range queries and a verifiable Trie for fuzzy queries, into a unified SQL-enabled workflow that interfaces with Ethereum, FISCO BCOS, and IPFS. Empirical results show speedups of up to about $78\times$ and an order-of-magnitude reduction in VO size over state-of-the-art baselines, along with support for six SQL primitives and cross-blockchain interoperability. The work advances practical cross-modal data querying for DApps while preserving blockchain security and compatibility; future work will focus on richer cross-modal operations and further performance optimizations.
Abstract
With its decentralization and immutability, blockchain has emerged as a trusted foundation for data management and querying. Because blockchain storage space is limited, large multimodal data files, such as videos, are often stored off-chain, leaving only lightweight metadata on the chain. While this hybrid storage approach improves storage efficiency, it introduces significant challenges for executing advanced queries on multimodal data. The metadata stored on-chain is often minimal and may not include all the attributes needed for queries such as time-range or fuzzy queries. In addition, existing blockchains do not provide native support for multimodal data querying. Adding this capability directly would require extensive modifications to the underlying blockchain framework, or even a redesign of its core architecture. Consequently, equipping blockchains with multimodal query capabilities remains an open problem that requires overcoming three key challenges: (1) designing efficient indexing methods that adapt to varying workloads with frequent insertions and queries; (2) achieving seamless integration with existing blockchains without altering the underlying infrastructure; and (3) ensuring high query performance while minimizing gas consumption. To address these challenges, we propose MulChain, a novel middleware architecture that integrates smoothly with existing blockchains. At the core of MulChain is the BHashTree, a flexible data structure that dynamically switches between tree and hash nodes based on workload characteristics, ensuring efficient insertion and query operations. Furthermore, the middleware provides standardized interfaces for blockchain systems, unifying query methods across different platforms.
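
The idea of a node that changes layout with the workload can be illustrated with a small sketch. This is not the BHashTree itself: the abstract does not specify the switching rule, so the class name `AdaptiveNode`, the `window` parameter, and the "more queries than inserts" heuristic are all assumptions made for illustration, and the sketch omits verification objects and gas accounting entirely. It only shows why a hash layout suits insert-heavy phases while a sorted layout suits time-range queries.

```python
import bisect


class AdaptiveNode:
    """Toy node that flips between a hash layout and a sorted layout.

    Illustration only: this is NOT the paper's BHashTree. The switching
    rule and every name here are hypothetical.
    """

    def __init__(self, window=8):
        self.mode = "hash"    # start in the insert-friendly layout
        self.table = {}       # hash mode: timestamp -> value
        self.keys = []        # tree mode: sorted timestamps
        self.vals = []        # tree mode: values aligned with self.keys
        self.window = window  # assumed: operations observed per decision
        self.inserts = 0
        self.queries = 0

    def insert(self, ts, value):
        if self.mode == "hash":
            self.table[ts] = value  # O(1) expected
        else:
            i = bisect.bisect_left(self.keys, ts)
            if i < len(self.keys) and self.keys[i] == ts:
                self.vals[i] = value  # overwrite an existing timestamp
            else:
                self.keys.insert(i, ts)  # O(n) shift keeps keys sorted
                self.vals.insert(i, value)
        self.inserts += 1
        self._maybe_switch()

    def range_query(self, lo, hi):
        """Return values with lo <= timestamp <= hi, ordered by timestamp."""
        if self.mode == "hash":
            # No ordering available: scan every entry, then sort the hits.
            hits = sorted((k, v) for k, v in self.table.items() if lo <= k <= hi)
            result = [v for _, v in hits]
        else:
            # Sorted layout: two binary searches bound the answer.
            i = bisect.bisect_left(self.keys, lo)
            j = bisect.bisect_right(self.keys, hi)
            result = self.vals[i:j]
        self.queries += 1
        self._maybe_switch()
        return result

    def _maybe_switch(self):
        # Decide once per window of operations, then reset the counters.
        if self.inserts + self.queries < self.window:
            return
        target = "tree" if self.queries > self.inserts else "hash"
        if target != self.mode:
            if target == "tree":
                items = sorted(self.table.items())
                self.keys = [k for k, _ in items]
                self.vals = [v for _, v in items]
                self.table = {}
            else:
                self.table = dict(zip(self.keys, self.vals))
                self.keys, self.vals = [], []
            self.mode = target
        self.inserts = self.queries = 0


node = AdaptiveNode(window=4)
for ts in (30, 10, 20, 40):
    node.insert(ts, f"tx{ts}")  # insert-heavy phase: node stays in hash mode
for _ in range(4):
    hits = node.range_query(10, 30)  # query-heavy phase triggers the switch
print(node.mode, hits)  # tree ['tx10', 'tx20', 'tx30']
```

In an on-chain setting the trade-off would also be weighed in gas rather than only in time, which this sketch does not attempt to model.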
