SymbolicAI: A framework for logic-based approaches combining generative models and solvers

Marius-Constantin Dinu, Claudiu Leoveanu-Condrei, Markus Holzleitner, Werner Zellinger, Sepp Hochreiter

TL;DR

SymbolicAI presents a modular neuro-symbolic framework that unifies generative models with a broad set of solvers by treating LLMs as semantic parsers within a probabilistic programming context. It formalizes symbols, expressions, and function composition to build hierarchical computational graphs and introduces the VERTEX score to evaluate multi-step, multi-modal workflows. The work contributes a concrete evaluation protocol and benchmark across associative, multimodal, program synthesis, logic, and graph tasks, highlighting strengths and limitations of current models. The approach aims to enable verifiable, explainable, and domain-invariant problem solving with potential implications for broad AI systems and autonomous agents.

Abstract

We introduce SymbolicAI, a versatile and modular framework employing a logic-based approach to concept learning and flow management in generative processes. SymbolicAI enables the seamless integration of generative models with a diverse range of solvers by treating large language models (LLMs) as semantic parsers that execute tasks based on both natural and formal language instructions, thus bridging the gap between symbolic reasoning and generative AI. We leverage probabilistic programming principles to tackle complex tasks, and utilize differentiable and classical programming paradigms with their respective strengths. The framework introduces a set of polymorphic, compositional, and self-referential operations for multi-modal data that connects multi-step generative processes and aligns their outputs with user objectives in complex workflows. As a result, we can transition between the capabilities of various foundation models with in-context learning capabilities and specialized, fine-tuned models or solvers proficient in addressing specific problems. Through these operations based on in-context learning, our framework enables the creation and evaluation of explainable computational graphs. Finally, we introduce a quality measure and its empirical score for evaluating these computational graphs, and propose a benchmark that compares various state-of-the-art LLMs across a set of complex workflows. We refer to the empirical score as the "Vector Embedding for Relational Trajectory Evaluation through Cross-similarity", or VERTEX score for short. The framework codebase and benchmark are linked below.

Paper Structure

This paper contains 74 sections, 12 equations, 14 figures, and 1 algorithm.

Figures (14)

  • Figure 1: Our neuro-symbolic framework enables a seamless transition between symbolic and differentiable programming, each with distinct dynamics and strengths. Differentiable programming provides access to foundational and specialist models. Classical programming, on the other hand, shifts between abstraction and implementation, focusing on high-level concepts before delving into the details of implementation.
  • Figure 1: VERTEX Protocol
  • Figure 2: Illustration of the NeSy pipeline, showing the conceptual use of in-context learning methodologies, domain-specific language (DSL) structures, and expression evaluation through a NeSy engine based on an LLM and constraint verification. The expression demonstrates the sorted insert operator $\ll$ and how the information of the symbol $\text{B}$ is included in the symbol $\text{AC}$. The violet placeholder in the DSL Prompt represents an instruction, such as "Insert the right-hand side value into the left-hand value in ascending order." The positions below it represent task-specific few-shot examples. The DSL Prompt receives the expression $\omega_{\ll}$ and maps it to $\hat{\omega}_{\ll}$, which the LLM-based NeSy function $\mathcal{V}_{\mathcal{S}^*}$ processes to output a new symbol.
  • Figure 3: a) Illustration of polymorphic context, using the example of a SQLExpression type and the add-operator. Without a polymorphic context, a regular Expression evaluation concatenates two Symbol objects. The polymorphic context in SQLExpression overrides the base behavior so that two added SQL expressions are combined semantically rather than concatenated. b) Illustration of the translation of a Symbol object into a prompt statement to be processed by an LLM in the NeSy engine. The User Input Args can carry a Payload from previous executions and are passed to the Custom Method. The user input, together with the polymorphic context of the Symbol Object attributes (Static Context and Dynamic Context), is translated into a prompt statement according to the schema of the Prompt Design. The fields Operation, Examples, and Template mark the operation description, DSL-based prompt examples, and template structures, respectively. These translations are processed according to the PreProcessor and engine-specific formatting. c) Illustration of the evaluation pipeline from user input to output, with multiple translation steps before and after the Engine invocation. The Input is passed to the Custom Method and reformatted by a PreProcessor to adhere to the DSL-specific structure. The Engine then takes the output of the PreProcessor, composes the final prompt according to the engine-specific Prompt Design, and resolves the polymorphic context and auxiliary fields. The output of the Engine can then be restructured by a PostProcessor to match the DSL requirements of the desired Output, and is checked against Constraints to verify the outcome.
  • Figure 4: We showcase a multi-step hierarchical computational graph, with each node in the graph represented by a symbol. The edges are relations between symbols. The left-hand side illustrates how a new node (Symbol 3) is obtained by evaluating an operation with its respective context on a NeSy engine. The right-hand side illustrates the context information window (yellow rectangle) and relationship of the resulting graph with its respective nodes.
  • ...and 9 more figures
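
The captions of Figures 2 and 3 describe two mechanisms that a small example may make more concrete: a polymorphic context that changes what the add-operator means for a subtype, and an evaluation pipeline that runs PreProcessor, Engine, PostProcessor, and Constraints in sequence. The following is a minimal sketch under stated assumptions, not the framework's actual API. The names ToySymbol, ToySQLExpression, pre_processor, stub_engine, post_processor, is_ascending, and evaluate are all hypothetical, the LLM-backed engine is replaced by a deterministic stub, and joining two SQL conditions with AND is only one assumed reading of "semantically combined".

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ToySymbol:
    """Hypothetical stand-in for a symbol (not the framework's Symbol class)."""
    value: str

    def __add__(self, other: "ToySymbol") -> "ToySymbol":
        # Base behavior from the Figure 3a caption: plain concatenation.
        return type(self)(self.value + other.value)


class ToySQLExpression(ToySymbol):
    """Hypothetical subtype whose context redefines the add-operator."""

    def __add__(self, other: "ToySymbol") -> "ToySymbol":
        # Assumption: "semantic" combination means AND-ing two conditions.
        return ToySQLExpression(f"({self.value}) AND ({other.value})")


def pre_processor(left: str, right: str) -> str:
    # Reformat the raw arguments into a DSL-style expression string.
    return f"{left} << {right}"


def stub_engine(prompt: str) -> str:
    # Deterministic placeholder for the LLM call: performs a sorted insert
    # and returns it with stray whitespace, as raw model output might.
    left, right = (part.strip() for part in prompt.split("<<"))
    return "  " + "".join(sorted(left + right)) + "\n"


def post_processor(raw: str) -> str:
    # Restructure the raw engine output into the desired output format.
    return raw.strip()


def is_ascending(output: str) -> bool:
    # Constraint: the characters of the result must be in ascending order.
    return list(output) == sorted(output)


def evaluate(
    left: str,
    right: str,
    engine: Callable[[str], str] = stub_engine,
    constraints: Iterable[Callable[[str], bool]] = (is_ascending,),
) -> str:
    """Run PreProcessor -> Engine -> PostProcessor -> Constraints."""
    prompt = pre_processor(left, right)
    output = post_processor(engine(prompt))
    for check in constraints:
        if not check(output):
            raise ValueError(f"constraint {check.__name__} failed for {output!r}")
    return output


if __name__ == "__main__":
    print(evaluate("AC", "B"))
    print((ToySymbol("A") + ToySymbol("B")).value)
    print((ToySQLExpression("a = 1") + ToySQLExpression("b = 2")).value)
```

In this sketch, evaluate("AC", "B") returns "ABC", mirroring the sorted insert of B into AC from Figure 2. Passing an engine that returns an unsorted string makes the constraint check raise a ValueError, which is the role the captions assign to constraint verification.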