The Promise and Pitfalls of WebAssembly: Perspectives from the Industry
Ningyu He, Shangtong Cao, Haoyu Wang, Yao Guo, Xiapu Luo
TL;DR
This work presents the largest-scale in-the-wild measurement of WebAssembly binaries deployed on the Web, addressing hosting environments, binary metadata, security threats, and practical uses. It combines a two-phase data collection pipeline (broad web crawling with phasewise historical reconstruction) and rigorous binary validation to produce a dataset of thousands of real-world Wasm binaries. Key findings reveal that Wasm adoption is still maturing, with substantial security concerns evidenced by phishing-hosted binaries, unmanaged call stacks, statically linked allocators, and imported security APIs. The study offers concrete guidance for web developers, Wasm maintainers, and researchers, including toolchain choices, security enhancements, and the need for semantics-recovery tools to better interpret Wasm binaries. Overall, it highlights both the progress and the vulnerabilities of Wasm in production web ecosystems and sets a foundation for targeted improvements in tooling, security, and ecosystem design.
Abstract
Because JavaScript has been criticized for performance and security issues in web applications, WebAssembly (Wasm) was proposed in 2017 and is regarded as a complement to JavaScript. Owing to advantages such as compact size, native-like speed, and portability, Wasm is increasingly used as the compilation target for industrial projects written in other high-level programming languages, and Wasm binaries handle computation-intensive tasks in browsers, e.g., 3D graphics rendering and video decoding. Characterizing Wasm binaries adopted in the wild from different perspectives, such as their metadata, their relation to the source programming language, the existence of security threats, and their practical purpose, is a prerequisite for delving deeper into the Wasm ecosystem and is beneficial to its roadmap selection. However, no existing work conducts a large-scale measurement study of Wasm binaries adopted in the wild. To fill this gap, we collect what is, to the best of our knowledge, the largest dataset to date and characterize the status quo of these binaries from industry perspectives. According to the different roles of people engaged in the community, i.e., web developers, Wasm maintainers, and researchers, we organize our findings into suggestions and best practices for each group. We believe this work can shed light on the future direction of the web and Wasm.
