Challenges of Multilingual Program Specification and Analysis

Carlo A. Furia, Abhishek Tiwari

TL;DR

This paper surveys the challenges of specifying and analyzing multilingual programs, organizing the landscape along inter-language communication levels (API, IR, native, system) and specification layers (types, dataflow, effects). It presents a concrete example at each communication level to illustrate cross-language type mismatches, bidirectional dataflow, and memory-model concerns, and shows how these facets complicate rigorous analysis. The state of the art is reviewed along language-agnostic and language-specific threads, including vulnerability and information-flow analyses for Android hybrid apps and JNI-based ecosystems, while noting gaps in handling dynamic features and in scalability. The authors advocate top-down semantic approaches (e.g., K, PLT Redex) to complement bottom-up empirical work, with the aim of building principled multilingual analysis frameworks that scale to real-world software stacks and guide practical tooling.

Abstract

Multilingual programs, whose implementations are made of different languages, are gaining traction especially in domains, such as web programming, that particularly benefit from the additional flexibility brought by using multiple languages. In this paper, we discuss the impact that the features commonly used in multilingual programming have on our capability of specifying and analyzing them. To this end, we first outline a few broad categories of multilingual programming, according to the mechanisms that are used for inter-language communication. Based on these categories, we describe several instances of multilingual programs, as well as the intricacies that formally reasoning about their behavior would entail. We also summarize the state of the art in multilingual program analysis, including the challenges that remain open. These contributions can help understand the lay of the land in multilingual program specification and analysis, and motivate further work in this area.

Paper Structure (28 sections, 4 figures)

Figures (4)

  • Figure 1: Code snippets that demonstrate Java-JavaScript hybrid programs written using Android's WebView framework.
  • Figure 2: Code snippets that demonstrate Scala data structures wrapping Java ones.
  • Figure 3: Code snippets that demonstrate Java Native Interface's (JNI) capabilities.
  • Figure 4: A Python script that communicates with shell command line utilities.
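
As an illustration of the system-level category that Figure 4 refers to, the following is a minimal sketch of a Python function that delegates work to a shell utility. It is not the paper's own figure: the function name `count_lines` and the choice of the `wc` utility are assumptions made for this example, and it presumes a POSIX environment where `wc` is on the PATH. The point it tries to show is that data crosses the language boundary only as untyped bytes, so any type or format guarantee has to be re-established by hand on the Python side.

```python
import subprocess


def count_lines(text: str) -> int:
    """Count newline-terminated lines in `text` by delegating to `wc -l`.

    Hypothetical helper for illustration; assumes a POSIX `wc` on the PATH.
    """
    # Outbound flow: the Python `str` is flattened to bytes on stdin.
    # The utility sees no type information, only a byte stream.
    result = subprocess.run(
        ["wc", "-l"],
        input=text.encode("utf-8"),
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    # Inbound flow: the result comes back as bytes on stdout.
    # Decoding and parsing are the caller's job; a format change in the
    # utility's output would surface here as a ValueError at run time.
    return int(result.stdout.decode("utf-8").strip())


if __name__ == "__main__":
    print(count_lines("a\nb\nc\n"))
```

Even in this small case, a static analysis of the Python code alone cannot tell what `wc` does with its input, nor that its output is always parseable as an integer; that knowledge lives outside the host language.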