Multimodal Instruction Disassembly with Covariate Shift Adaptation and Real-time Implementation
Yunkai Bai, Jungmin Park, Domenic Forte
TL;DR
This work tackles real-time, non-invasive instruction disassembly from side-channel traces by introducing a compact dual-channel platform (RASCv3) that simultaneously captures power and EM signals. It combines mutual-information-based feature fusion, minimum-redundancy maximum-relevance (mRMR) feature selection, and a covariate-shift minimization scheme, all implemented on a resource-constrained FPGA, to classify ARM/AVR instructions in real time. The approach is demonstrated on six benchmarks with an 8-bit Arduino UNO. Key contributions include a formal MI-driven dual-channel fusion framework, a self-enhancing QDA-based classifier for non-stationary environments, and a practical real-time pipeline that outperforms single-channel approaches while remaining feasible on low-cost hardware. The results show offline accuracy near 90% and real-time accuracy around 80%, with notable improvements over time. The authors also discuss the challenges of extending the approach to more complex targets and DVFS scenarios, and its implications for malware detection and OT monitoring. Overall, the approach provides a scalable path to in-situ, real-time side-channel disassembly using multimodal signals and adaptive classification on compact hardware.
Abstract
Side-channel based instruction disassembly has been proposed as a low-cost and non-invasive approach for security applications such as IP infringement detection, code flow analysis, malware detection, and reconstructing unknown code from obsolete systems. However, existing approaches to side-channel based disassembly rely on setups to collect and process side-channel traces that make them impractical for real-time applications. In addition, they rely on fixed classifiers that cannot adapt to statistical deviations in side-channels caused by different operating environments. In this article, we advance the state of the art in side-channel based disassembly in multiple ways. First, we introduce a new miniature platform, RASCv3, that can simultaneously collect power and EM measurements from a target device and subsequently process them for instruction disassembly in real time. Second, we devise a new approach to combine and select features from power and EM traces using information theory that improves classification accuracy and avoids the curse of dimensionality. Third, we explore covariate shift adjustment techniques that further improve accuracy over time and in response to statistical changes. The proposed methodology is demonstrated on six benchmarks, and the recognition rates of offline and real-time instruction disassemblers are compared for single- and multi-modal cases with a variety of classifiers and over time. Since the proposed approach is only applied to an 8-bit Arduino UNO, we also discuss challenges of extending to more complex targets.
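To make the feature-selection idea concrete, the following is a minimal sketch of greedy mRMR selection over a pool of combined power and EM features. It is an illustration under stated assumptions, not the authors' implementation. It assumes features have already been discretized, uses the common "relevance minus mean redundancy" form of the mRMR criterion, and runs on a made-up toy dataset. The function names (mutual_information, mrmr_select) and feature names (power_a, em_a, and so on) are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def mrmr_select(features, labels, k):
    """Greedily pick k feature names by relevance minus mean redundancy.

    features: dict mapping a (hypothetical) feature name to its discrete values
    labels:   instruction class of each trace
    """
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected = []
    remaining = set(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up): 8 traces, 4 instruction classes.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
features = {
    "power_a": [0, 0, 0, 0, 1, 1, 1, 1],  # informative
    "power_b": [0, 0, 0, 0, 1, 1, 1, 1],  # duplicate of power_a (redundant)
    "em_a":    [0, 0, 1, 1, 0, 0, 1, 1],  # informative, complements power_a
    "em_b":    [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}
selected = mrmr_select(features, labels, k=2)
print(selected)
```

On this toy data the selection keeps one power feature and one EM feature and skips the duplicated power feature, since the duplicate adds redundancy without adding information about the instruction class. This is the intuition behind selecting from both channels jointly rather than simply concatenating all features.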
