How the (Tensor-) Brain uses Embeddings and Embodiment to Encode Senses and Symbols

Volker Tresp; Hang Li

TL;DR

The Tensor Brain (TB) framework presents a two-layer, brain-inspired architecture that combines a subsymbolic representation layer (the global workspace) with a symbolic index layer. It formalizes how embeddings link perception with concepts, predicates, and episodes, and introduces a probabilistic interpretation of the cognitive brain state (pCBS) to support top-down inference and imagination. Through self-generated labels, attention mechanisms, embodiment, and autoencoding, TB demonstrates embedded symbolic reasoning, episodic and semantic memory, and multimodal generalization, while also addressing language precursors and cognitive control. The model unifies perception, memory, and language-like reasoning within a single recurrent, self-supervised architecture, highlighting both the potential for rich, grounded cognition and the challenges of probabilistic interpretation and memory fidelity.

Abstract

The Tensor Brain (TB) has been introduced as a computational model for perception and memory. This paper provides an overview of the TB model, incorporating recent developments and insights into its functionality. The TB is composed of two primary layers: the representation layer and the index layer. The representation layer serves as a model for the subsymbolic global workspace, a concept derived from consciousness research. Its state represents the cognitive brain state, capturing the dynamic interplay of sensory and cognitive processes. The index layer, in contrast, contains symbolic representations for concepts, time instances, and predicates. In a bottom-up operation, sensory input activates the representation layer, which then triggers associated symbolic labels in the index layer. Conversely, in a top-down operation, symbols in the index layer activate the representation layer, which in turn influences earlier processing layers through embodiment. This top-down mechanism underpins semantic memory, enabling the integration of abstract knowledge into perceptual and cognitive processes. A key feature of the TB is its use of concept embeddings, which function as connection weights linking the index layer to the representation layer. As a concept's "DNA," these embeddings consolidate knowledge from diverse experiences, sensory modalities, and symbolic representations, providing a unified framework for learning and memory.
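
The bottom-up and top-down operations described in the abstract can be illustrated with a minimal sketch. It assumes a softmax readout of the index layer and an additive top-down update; neither is specified in the abstract, and the labels, embedding values, and function names (`bottom_up`, `top_down`) are hypothetical. The only detail taken from the paper is that the embedding vectors are the columns of the matrix A.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bottom_up(A, q):
    """Index-layer probabilities given the representation vector q.

    A[d][k] is entry d of the embedding of index k (embeddings are columns).
    The softmax readout is an assumption made for this sketch.
    """
    n_dims, n_indices = len(A), len(A[0])
    scores = [sum(A[d][k] * q[d] for d in range(n_dims)) for k in range(n_indices)]
    return softmax(scores)


def top_down(A, q, k):
    """Feed the embedding of index k back into the representation layer.

    The additive update is an assumption made for this sketch.
    """
    return [q[d] + A[d][k] for d in range(len(q))]


# Hypothetical toy setup: three representation dimensions, three indices.
LABELS = ["Dog", "Bench", "Mammal"]
A = [
    [2.0, 0.0, 1.0],
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 2.0],
]
q = [1.0, 0.0, 0.2]  # stand-in for the state produced by sensory input

probs = bottom_up(A, q)                                    # bottom-up: q -> labels
winner = max(range(len(LABELS)), key=lambda k: probs[k])   # most probable index
q_next = top_down(A, q, winner)                            # top-down: label -> q
print(LABELS[winner], q_next)
```

Because the same columns of A are used in both directions, one embedding serves both to recognize a concept and to re-activate its content in the representation layer, which is the role the abstract assigns to concept embeddings.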

Paper Structure (62 sections, 13 equations, 4 figures, 3 algorithms)

Introduction
The Development of the Tensor Brain
The Subsymbolic Representation Layer
The Representation Layer and the Cognitive Brain State (CBS)
The Representation Layer, the Global Workspace Theory, and the Blackboard
Evolution Neural Network and Recurrency
The Probabilistic Cognitive Brain State (pCBS)
The Symbolic Index Layer
The Index Layer
Concept Indices
Predicate Indices
Episodic Indices Refer to Time Instances
Symbolic Encoding
A Deterministic Model without Top-down Inference
A Single Sampled Label
...and 47 more sections

Figures (4)

Figure 1: The Tensor Brain (TB) Architecture: The architecture processes scene input (illustrated at the bottom) through the layers of a deep convolutional network, mapping it to the representation layer, which serves as a mathematical model of the global workspace. On the right, the evolution neural network, featuring one hidden layer, provides recurrence. Within the TB framework, this recurrent component is referred to as the dynamic context layer. The representation layer connects to the index layer, which in turn feeds back into the representation layer. The embedding vectors are represented by the columns of matrix $\mathbf{A}$. Through bottom-up and top-down processing, the system can generate multiple labels for a given scene or region of interest (ROI).
Figure 2: Johann Wolfgang von Goethe gazes upon a picturesque landscape, with mountains forming the distant backdrop (perception). As his eyes sweep across the scene, he identifies the Frauenkirche (symbolic labeling) and deduces that the city before him must be Munich (semantic memory). Observing the mountains, he further concludes they must be the Alps (semantic memory). This view triggers memories of his last visit to Munich (episodic memory), particularly a delightful dinner at the Flaucher. Reflecting on the weather forecast, which promises a pleasant evening, Goethe contemplates whether to revisit the Flaucher for another memorable meal (decision support through future episodic memory and imagination).
Figure 3: Experimental results from the Tensor Brain (TB) for perception, enhanced by semantic memory, as reported in [tresp2023tensor]. In the first region of interest (ROI), the left bounding box is identified as Sparky. The top-ranked labels for this ROI include: Dog, Mammal, LivingBeing, Young, White, and OtherActivity. For the second ROI, the top-ranked labels are: Bench, Furniture, NonLivingBeing, Old, OtherColor, and OtherActivity. The sampled binary statements are: (Dog, sit on, Bench), (LivingBeing, on, Furniture), (Mammal, sit on, Old), (White, sit on, Bench). Additionally, semantic memory enhances perception by providing binary statements related to entities not present in the scene, such as: (Sparky, ownedBy, Jack) and (Sparky, lovedBy, Mary), where Jack and Mary are part of the agent's semantic memory, but not visible in the scene.
Figure 4: Multimodality and Reasoning. Top: Some dimensions in the representation layer are visual, some auditory, some tactile, and some abstract. Different dimensions communicate with different perceptual paths and symbolic indices. The horizontal dotted lines between the indices indicate symbolic semantic memory and reasoning. Bottom: Sequential operations in perception. Indices are sampled and added from top to bottom. The first line indicates that sensory input activates auditory and visual dimensions of the representation layer. Then the index Sparky fires which adds information on auditory and visual dimensions and in addition on tactile and abstract dimensions. The decoding continues with Black, Barking, Happy, Dog and Mammal. Mammal fires because Dog has fired; there is no direct overlap in the representation layer with Sparky. This can be achieved by an adaptation of Sparky's embedding (embedded reasoning). In an episodic memory, the first line is represented by an embedding vector. In semantic memory, the first line is missing, and the process starts by activating the index for Sparky.
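
The sequential decoding in the bottom panel of Figure 4 can be sketched as a loop that samples an index, adds its embedding to the representation layer, and repeats. This is a rough illustration under stated assumptions, not the paper's algorithm: the softmax readout, the additive update, the exclusion of already-fired indices, the three toy dimensions (visual, auditory, abstract), and all embedding values are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def index_probs(E, q, fired):
    """Probabilities over the indices that have not fired yet (assumption)."""
    labels = [label for label in E if label not in fired]
    scores = [sum(e * x for e, x in zip(E[label], q)) for label in labels]
    return labels, softmax(scores)


def decode(E, q, steps, rng):
    """Sample indices one at a time, adding each fired embedding to q."""
    q = list(q)
    fired = []
    for _ in range(steps):
        labels, probs = index_probs(E, q, fired)
        label = rng.choices(labels, weights=probs, k=1)[0]
        fired.append(label)
        q = [x + e for x, e in zip(q, E[label])]
    return fired, q


# Hypothetical embeddings over the dimensions (visual, auditory, abstract).
E = {
    "Sparky": [2.0, 2.0, 0.0],
    "Dog": [1.0, 0.0, 2.0],
    "Mammal": [0.0, 0.0, 3.0],  # purely abstract: no overlap with Sparky
}
q0 = [1.0, 1.0, 0.0]  # sensory input activates visual and auditory dimensions

# Mammal is unlikely from sensory input alone ...
_, before = index_probs(E, q0, [])
p_mammal_before = before[2]

# ... but becomes more likely once Dog has added its abstract component.
q_dog = [x + e for x, e in zip(q0, E["Dog"])]
_, after = index_probs(E, q_dog, [])
p_mammal_after = after[2]

fired, q_final = decode(E, q0, steps=3, rng=random.Random(0))
print(fired, q_final, p_mammal_before, p_mammal_after)
```

In this toy setup, Mammal shares no dimension with the sensory input or with Sparky, so its probability rises only after Dog contributes to the abstract dimension, mirroring the caption's remark that Mammal fires because Dog has fired.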
