Towards an Oracle for Binary Decomposition Under Compilation Variance
Ang Jia, He Jiang, Zhilei Ren, Xiaochen Li, Zhipeng Yang, Yaxin Duan, Ming Fan, Ting Liu
TL;DR
This work introduces the first compilation-variance–aware oracle for binary decomposition by defining Minimal Equivalent Function Regions (MEFRs) and building ground-truth FCG mappings across 17 compilers, 6 optimizations, and 4 architectures. It provides a labeled dataset and an evaluation framework to rigorously assess anchor- and clustering-based TPL decomposition methods, revealing prevalent under- and over-aggregation issues caused by function inlining and compiler differences. The results demonstrate that current decomposition approaches struggle to robustly detect reused libraries under compilation variance, underscoring the need for compilation-aware techniques. The authors also release their oracle construction toolkit and data to drive future research in robust binary analysis for security and software provenance.
Abstract
Third-Party Library (TPL) detection, which identifies reused libraries in binary code, is critical for software security analysis. At its core, TPL detection depends on binary decomposition, the process of partitioning a monolithic binary into cohesive modules. Existing decomposition methods, whether anchor-based or clustering-based, fundamentally rely on the assumption that reused code exhibits similar function call relationships. However, this assumption is severely undermined by Function Call Graph (FCG) variations introduced by diverse compilation settings, particularly function inlining decisions that drastically alter FCG structures. In this work, we conduct the first systematic empirical study to establish an oracle for optimal binary decomposition under compilation variance. We first develop a labeling method to create precise FCG mappings on a comprehensive dataset compiled with 17 compilers, 6 optimizations, and 4 architectures; then, we identify the minimum semantic-equivalent function regions between FCG variants to derive the ground-truth decomposition. This oracle provides the first rigorous evaluation framework that quantitatively assesses decomposition algorithms under compilation variance. Using this oracle, we evaluate existing methods and expose their critical limitations: each suffers from either under-aggregation or over-aggregation failures. Our findings reveal that current decomposition techniques are inadequate for robust TPL detection, highlighting the urgent need for compilation-aware approaches.
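To make the effect of inlining on an FCG concrete, the following is a minimal illustrative sketch, not the authors' labeling method. It assumes a toy FCG stored as a dictionary of callee sets; the function names (`parse`, `read_header`, `read_body`, `checksum`) and the helper `inline_callee` are hypothetical. It shows how a single inlining decision removes a node and rewires edges, so the same source code yields two structurally different call graphs.

```python
def inline_callee(fcg, caller, callee):
    """Return a copy of `fcg` in which `callee` is inlined into `caller`.

    `fcg` maps each function name to the set of functions it calls.
    This is a toy model: real compilers decide inlining per call site.
    """
    new = {fn: set(callees) for fn, callees in fcg.items()}
    # The call edge caller -> callee disappears ...
    new[caller].discard(callee)
    # ... and the caller inherits the callee's outgoing calls.
    new[caller] |= fcg[callee]
    # If nothing calls the callee any more, assume its standalone body is dropped.
    if not any(callee in callees for callees in new.values()):
        del new[callee]
    return new


# Hypothetical FCG of a small library built without inlining.
fcg_no_inline = {
    "parse": {"read_header", "read_body"},
    "read_header": {"checksum"},
    "read_body": {"checksum"},
    "checksum": set(),
}

# The same library after `read_header` is inlined into `parse`.
fcg_inlined = inline_callee(fcg_no_inline, "parse", "read_header")

for name, callees in sorted(fcg_inlined.items()):
    print(name, "->", sorted(callees))
```

In this toy case the inlined variant has three functions instead of four, and `parse` now calls `checksum` directly. A decomposition method that compares call relationships function by function would see two different graphs, whereas a region-level view could treat `{parse, read_header}` in the first build and `{parse}` in the second as covering the same source code.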
