A Neuro-Symbolic Approach to Multi-Agent RL for Interpretability and Probabilistic Decision Making

Chitra Subramanian; Miao Liu; Naweed Khan; Jonathan Lenchner; Aporva Amarnath; Sarathkrishna Swaminathan; Ryan Riegel; Alexander Gray

TL;DR

This work addresses the interpretability and uncertainty challenges of multi-agent reinforcement learning (MARL) for runtime resource management by combining neural learning with symbolic logic. It introduces an event-driven multiagent POMDP (ED-MPOMDP) framework, uses Logical Neural Networks (LNN) for interpretable rule learning, and adds Probabilistic Logical Neural Networks (PLNN) for probabilistic inference under partial observability. The key contributions are a formal ED-MPOMDP formulation for power sharing in Heterogeneous System-on-Chip (HSoC) environments, LNN-based Phase 1 rule learning with domain knowledge and guard rails, and a PLNN-based dynamic decision-making pipeline that adapts rules in real time. The reported results show that LNN rules improve job DAG completion times under moderate to heavy load, while PLNN enables robust, probabilistic adaptation to unseen or partially observed states, yielding performance close to ideal targets and offering interpretable diagnostics for runtime control.

Abstract

Multi-agent reinforcement learning (MARL) is well-suited for runtime decision-making in optimizing the performance of systems where multiple agents coexist and compete for shared resources. However, applying common deep learning-based MARL solutions to real-world problems suffers from issues of interpretability, sample efficiency, partial observability, etc. To address these challenges, we present an event-driven formulation, where decision-making is handled by distributed co-operative MARL agents using neuro-symbolic methods. The recently introduced neuro-symbolic Logical Neural Networks (LNN) framework serves as a function approximator for the RL, to train a rules-based policy that is both logical and interpretable by construction. To enable decision-making under uncertainty and partial observability, we developed a novel probabilistic neuro-symbolic framework, Probabilistic Logical Neural Networks (PLNN), which combines the capabilities of logical reasoning with probabilistic graphical models. In PLNN, the upward/downward inference strategy, inherited from LNN, is coupled with belief bounds by setting the activation function for the logical operator associated with each neural network node to a probability-respecting generalization of the Fréchet inequalities. These PLNN nodes form the unifying element that combines probabilistic logic and Bayes Nets, permitting inference for variables with unobserved states. We demonstrate our contributions by addressing key MARL challenges for power sharing in a system-on-chip application.
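
The upward/downward inference over belief bounds mentioned in the abstract can be illustrated with the classical, unmodulated Fréchet inequalities. The following is a minimal sketch, not the paper's PLNN activation (which uses a $J$-modulated generalization); the function names `and_upward`, `or_upward`, and `or_downward` and the `Bounds` type are hypothetical names chosen for illustration, and each proposition is assumed to carry a (lower, upper) probability interval.

```python
from typing import Tuple

# A belief is a (lower, upper) interval on a probability. Illustrative type only.
Bounds = Tuple[float, float]


def and_upward(a: Bounds, b: Bounds) -> Bounds:
    """Classical Fréchet bounds on P(A and B) from interval bounds on P(A), P(B)."""
    lower = max(0.0, a[0] + b[0] - 1.0)
    upper = min(a[1], b[1])
    return (lower, upper)


def or_upward(a: Bounds, b: Bounds) -> Bounds:
    """Classical Fréchet bounds on P(A or B) from interval bounds on P(A), P(B)."""
    lower = max(a[0], b[0])
    upper = min(1.0, a[1] + b[1])
    return (lower, upper)


def or_downward(or_bounds: Bounds, a: Bounds, b: Bounds) -> Bounds:
    """Tighten the bounds on P(B) given bounds on P(A or B) and P(A).

    Uses P(B) <= P(A or B) and P(A or B) <= P(A) + P(B).
    """
    lower = max(b[0], or_bounds[0] - a[1], 0.0)
    upper = min(b[1], or_bounds[1])
    return (lower, upper)


if __name__ == "__main__":
    # Upward pass: operand bounds flow into the OR node.
    print(or_upward((0.25, 0.5), (0.125, 0.25)))               # (0.25, 0.75)
    # Downward pass: B starts fully unknown and is tightened from the OR node and A.
    print(or_downward((0.75, 1.0), (0.25, 0.5), (0.0, 1.0)))   # (0.25, 1.0)
```

In the downward call, B starts with the vacuous interval (0, 1); knowing that the disjunction holds with probability at least 0.75 while A holds with probability at most 0.5 forces the lower bound on B up to 0.25.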

Paper Structure (31 sections, 20 equations, 12 figures, 1 table)

Introduction
Preliminaries
Multiagent POMDPs
MARL for HSoC based on Event-driven MPOMDPs
Probabilistic Logic and PLNN
Related Work
Formal Specification of a PLNN
Motivating Concepts of PLNN
Fréchet Inequalities
Upward and Downward Inference
$J$-Modulation of the Fréchet Inequalities
Upward and Downward Inference Revisited
Experimental Results and Discussion
LNN Rule Learning Implementation
Phase1 Training Method
...and 16 more sections

Figures (12)

Figure 1: (a) A job DAG capturing the dependencies of tasks and a mapping of tasks to tiles; (b) An example of runtime information of the job.
Figure 2: The Venn Diagram associated with two propositions, $A$ and $B$, with fixed marginal probabilities $p(A) = p, p(B) = q$ in the case (a) where $A$ and $B$ are maximally correlated and the cases (b1) and (b2) where $A$ and $B$ are maximally anti-correlated.
Figure 3: (a) Simple upward inference in the case of operands, $A$ and $B$, coming into an OR operational node. The bounds for the $\vee$ node are updated using the associated two-argument Fréchet Inequalities. (b) The analogous downward inference for the same set of nodes. Using bounds on $A \vee B$ and $A$ we can back out inferred bounds on $B$.
Figure 4: (a) A job DAG capturing the dependencies of tasks and a mapping of tasks to tiles; (b) An example of runtime information of the job.
Figure 5: Learned LNN weights for agent tokens requests and interpretable rules based on weight threshold of 0.1.
...and 7 more figures

Theorems & Definitions (2)

Definition 1
Example C.1
