Symbol Preference Aware Generative Models for Recovering Variable Names from Stripped Binary
Xiangzhe Xu, Zhuo Zhang, Zian Su, Ziyang Huang, Shiwei Feng, Yapeng Ye, Nan Jiang, Danning Xie, Siyuan Cheng, Lin Tan, Xiangyu Zhang
TL;DR
GenNm tackles the challenge of recovering meaningful variable names from fully stripped binaries by treating it as a generative, context-aware task rather than a closed-vocabulary classification problem. It fine-tunes pre-trained code-language models on decompiled code with local and contextual inputs, and introduces Symbol Preference Optimization to align model outputs with developer naming preferences. Inference is performed iteratively along the program call graph, using context propagation and a name-validation step to ensure cross-function semantic consistency. Across two large datasets, GenNm yields consistent improvements over state-of-the-art baselines, including substantial gains when ground-truth names are unseen during training, and it outperforms black-box LLMs in precision and semantic relevance. The work demonstrates that combining context-aware fine-tuning, symbol-preference guidance, and iterative, graph-based context augmentation significantly enhances variable-name recovery, with clear implications for malware analysis and binary understanding.
Abstract
Decompilation aims to recover the source code form of a binary executable. It has many security applications, such as malware analysis, vulnerability detection, and code hardening. A prominent challenge in decompilation is to recover variable names. We propose a novel technique that leverages the strengths of generative models while mitigating model biases. We build a prototype, GenNm, from the pre-trained generative models CodeGemma-2B, CodeLlama-7B, and CodeLlama-34B. We fine-tune GenNm on decompiled functions and teach the models to leverage contextual information. When querying a function, GenNm includes names from its callers and callees, providing rich contextual information within the model's input token limit. We mitigate model biases by aligning the output distribution of the models with the symbol preferences of developers. Our results show that GenNm improves the state-of-the-art name recovery precision by 5.6-11.4 percentage points on two commonly used datasets, and achieves a 32% relative improvement over the state of the art (from 17.3% to 22.8%) in the most challenging setup, where ground-truth variable names are not seen in the training dataset.
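The idea of querying a function together with names from its callers and callees can be illustrated with a minimal sketch. This is not GenNm's implementation: the function `build_context_query`, the comment-style prompt layout, the dictionary shapes, and the whitespace-based token count are all assumptions made for illustration. The sketch prepends names already predicted for neighboring functions to the decompiled body and drops context lines once a token budget is exhausted.

```python
from typing import Dict, List, Optional


def build_context_query(
    func: str,
    decompiled: Dict[str, str],
    callers: Dict[str, List[str]],
    callees: Dict[str, List[str]],
    predicted_names: Dict[str, Dict[str, str]],
    max_tokens: int = 512,
) -> str:
    """Assemble a query: neighbor names as comment lines, then the function body.

    All parameter names and the prompt layout are hypothetical.
    """

    def describe(role: str, neighbor: str) -> Optional[str]:
        # Summarize names already predicted for a neighboring function.
        names = predicted_names.get(neighbor, {})
        if not names:
            return None
        pairs = ", ".join(f"{old}->{new}" for old, new in sorted(names.items()))
        return f"// {role} {neighbor}: {pairs}"

    context_lines: List[str] = []
    for role, table in (("caller", callers), ("callee", callees)):
        for neighbor in table.get(func, []):
            line = describe(role, neighbor)
            if line is not None:
                context_lines.append(line)

    body = decompiled[func]
    # Assumption: whitespace splitting is a crude stand-in for a real tokenizer.
    budget = max_tokens - len(body.split())
    kept: List[str] = []
    for line in context_lines:
        cost = len(line.split())
        if cost > budget:
            break  # stop adding context once the budget is exhausted
        kept.append(line)
        budget -= cost
    return "\n".join(kept + [body])
```

In an iterative setup, a caller of this helper would update `predicted_names` after each round of model queries and rebuild the query for each function, so that names propagate along the call graph. A real system would count tokens with the model's own tokenizer rather than by whitespace.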
