Table of Contents
Fetching ...

Data Petri Nets meet Probabilistic Programming (Extended version)

Martin Kuhn, Joscha Grüger, Christoph Matheja, Andrey Rivkin

TL;DR

This work proposes a formal bridge between Data Petri Nets (DPNs) and probabilistic programming by translating DPNs with schedulers into probabilistic programs that preserve the exact run distribution under a given scheduler. The approach defines a rigorous DPN execution semantics, introduces a probabilistic scheduler, and constructs a $\textsc{PPL}$ program $C_{sim}$ (with $C_{init}$ and a main loop) that reproduces all legal net runs, enabling exact simulation and Bayesian inference for process mining tasks. The authors prove correctness of the translation and demonstrate a proof-of-concept implementation in WebPPL, validating the method on complex nets and illustrating potential benefits for log generation, distribution analysis, and what-if analyses. The work enables statistical guarantees in event-log synthesis and PM analytics, leveraging existing PP tooling for enhanced insight into data-aware processes.

Abstract

Probabilistic programming (PP) is a programming paradigm that allows for writing statistical models like ordinary programs, performing simulations by running those programs, and analyzing and refining their statistical behavior using powerful inference engines. This paper takes a step towards leveraging PP for reasoning about data-aware processes. To this end, we present a systematic translation of Data Petri Nets (DPNs) into a model written in a PP language whose features are supported by most PP systems. We show that our translation is sound and provides statistical guarantees for simulating DPNs. Furthermore, we discuss how PP can be used for process mining tasks and report on a prototype implementation of our translation. We also discuss further analysis scenarios that could be easily approached based on the proposed translation and available PP tools.

Data Petri Nets meet Probabilistic Programming (Extended version)

TL;DR

This work proposes a formal bridge between Data Petri Nets (DPNs) and probabilistic programming by translating DPNs with schedulers into probabilistic programs that preserve the exact run distribution under a given scheduler. The approach defines a rigorous DPN execution semantics, introduces a probabilistic scheduler, and constructs a program (with and a main loop) that reproduces all legal net runs, enabling exact simulation and Bayesian inference for process mining tasks. The authors prove correctness of the translation and demonstrate a proof-of-concept implementation in WebPPL, validating the method on complex nets and illustrating potential benefits for log generation, distribution analysis, and what-if analyses. The work enables statistical guarantees in event-log synthesis and PM analytics, leveraging existing PP tooling for enhanced insight into data-aware processes.

Abstract

Probabilistic programming (PP) is a programming paradigm that allows for writing statistical models like ordinary programs, performing simulations by running those programs, and analyzing and refining their statistical behavior using powerful inference engines. This paper takes a step towards leveraging PP for reasoning about data-aware processes. To this end, we present a systematic translation of Data Petri Nets (DPNs) into a model written in a PP language whose features are supported by most PP systems. We show that our translation is sound and provides statistical guarantees for simulating DPNs. Furthermore, we discuss how PP can be used for process mining tasks and report on a prototype implementation of our translation. We also discuss further analysis scenarios that could be easily approached based on the proposed translation and available PP tools.
Paper Structure (17 sections, 12 theorems, 47 equations, 6 figures)

This paper contains 17 sections, 12 theorems, 47 equations, 6 figures.

Key Result

theorem thmcountertheorem

Let $C_{\mathit{loop}}$ be the $\textsc{PPL}$ program constructed for net $N$, goal states $\mathcal{G}$, and scheduler $\mathfrak{S}$ in fig:pcn. For all states $(M,\alpha)$ of $N$,

Figures (6)

  • Figure 1: A simple auction process FelliMW22
  • Figure 2: Syntax of $\textsc{PPL}$.
  • Figure 3: Example program.
  • Figure 4: Semantics of $\textsc{PPL}$ programs. Here, $(s,\ell) \in \textnormal{PSL}$. We denote by $s[x \mapsto v]$ the update of $s$ in which the value of $x$ is set to $v$, i.e. $s[x \mapsto v](y) = v$ if $y = x$, and $s[x \mapsto v](y) = s(y)$, otherwise. $\ell \llbracket \mathit{msg} \rrbracket(s)$ denotes the concatenation of log $\ell$ and the evaluation of message $\mathit{msg}$ in $s$. $\textnormal{guards}(GC)$ and $\textnormal{weight}(GC)$ are defined further below.
  • Figure 5: The $\textsc{PPL}$ program $C_{sim}$ simulating the net $N$.
  • ...and 1 more figures

Theorems & Definitions (28)

  • definition thmcounterdefinition: Data Petri net
  • definition thmcounterdefinition: DPN state
  • definition thmcounterdefinition: Transition enabling, firing
  • definition thmcounterdefinition: DPN Scheduler
  • definition thmcounterdefinition: Step Likelihood
  • definition thmcounterdefinition: Likelihood of a Run
  • definition thmcounterdefinition: Probability of a Run
  • definition thmcounterdefinition: Syntax of $\textsc{PPL}$
  • definition thmcounterdefinition: Semantics of $\textsc{PPL}$
  • theorem thmcountertheorem: Correctness
  • ...and 18 more