Scalable and Precise Application-Centered Call Graph Construction for Python
Kaifeng Huang, Yixuan Yan, Bihuan Chen, Zixin Tao, Xin Peng
TL;DR
The paper tackles scalable, precise call graph construction (CGC) for Python by introducing Jarvis, an application-centered CGC framework that maintains a flow-sensitive type graph ($\mathcal{FTG}$) for each function and builds the call graph on the fly. Jarvis combines an AST-based representation with intra- and inter-procedural analyses to resolve calls efficiently, while maintaining modular summaries ($\mathcal{F}$, $\mathcal{C}$, $\mathcal{I}$) and a global call graph $\mathcal{CG}$ built during analysis. The approach improves on the state-of-the-art PyCG in both running time and precision, particularly on large real-world programs, and shows practical benefits for dependency management and security analysis. Overall, Jarvis provides a scalable, more precise mechanism for Python CGC that can support large ecosystems and downstream tasks such as debloating and vulnerability reachability analysis.
Abstract
Call graph construction is the foundation of inter-procedural static analysis. PyCG is the state-of-the-art approach for constructing call graphs for Python programs. Unfortunately, PyCG does not scale to large programs when adapted to whole-program analysis, where both the application and its dependent libraries are analyzed. Moreover, PyCG is flow-insensitive and does not fully support Python's features, which hinders its accuracy. To overcome these drawbacks, we propose a scalable and precise approach for constructing application-centered call graphs for Python programs, and implement it as a prototype tool, Jarvis. Jarvis maintains a type graph (i.e., the type relations of program identifiers) for each function in a program to allow type inference. Taking one function as input, Jarvis generates the call graph on the fly, alternating flow-sensitive intra-procedural analysis with inter-procedural analysis and applying strong updates. Our evaluation on a micro-benchmark of 135 small Python programs and a macro-benchmark of 6 real-world Python applications shows that Jarvis improves on PyCG: it is at least 67% faster, 84% higher in precision, and at least 20% higher in recall.
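To make the difference between flow-insensitive and flow-sensitive resolution concrete, the following is a minimal toy sketch. It is not Jarvis's or PyCG's implementation: the statement encoding, the names `BODY`, `flow_insensitive`, and `flow_sensitive`, and the restriction to straight-line code are all assumptions made for illustration. It only shows why a strong update (overwriting a variable's previous targets on reassignment) can remove spurious call edges.

```python
# Toy model (hypothetical encoding, for illustration only):
#   ("assign", var, func_name)  binds `var` to the function `func_name`
#   ("call", var)               calls whatever `var` currently refers to
BODY = [
    ("assign", "f", "foo"),
    ("call", "f"),
    ("assign", "f", "bar"),
    ("call", "f"),
]


def flow_insensitive(body):
    """Resolve calls while ignoring statement order.

    Every assignment to a variable is merged into one set, so each call
    site sees all functions the variable is ever bound to.
    """
    points_to = {}
    for stmt in body:
        if stmt[0] == "assign":
            _, var, target = stmt
            points_to.setdefault(var, set()).add(target)
    return [
        sorted(points_to.get(stmt[1], set()))
        for stmt in body
        if stmt[0] == "call"
    ]


def flow_sensitive(body):
    """Resolve calls by walking statements in order.

    Each assignment replaces the variable's previous targets (a strong
    update), so a call site only sees the most recent binding.
    """
    points_to = {}
    resolved = []
    for stmt in body:
        if stmt[0] == "assign":
            _, var, target = stmt
            points_to[var] = {target}  # strong update: overwrite, do not merge
        elif stmt[0] == "call":
            resolved.append(sorted(points_to.get(stmt[1], set())))
    return resolved


if __name__ == "__main__":
    # One list of candidate callees per call site, in program order.
    print("flow-insensitive:", flow_insensitive(BODY))
    print("flow-sensitive:  ", flow_sensitive(BODY))
```

In this toy, the order-ignoring version reports both `foo` and `bar` at each call site, while the ordered version reports `foo` at the first call and `bar` at the second. Branches, loops, attribute accesses, and inter-procedural effects are deliberately left out; handling them is where a real analysis does most of its work.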
