Parameter Choice and Neuro-Symbolic Approaches for Deep Domain-Invariant Learning

Marius-Constantin Dinu

TL;DR

This work establishes a framework for scalable and generalizable broad AI systems applicable across various problem settings, demonstrating how symbolic reasoning and large language models can build universal computational graphs that generalize across domains and problems, contributing to more adaptable AI approaches for real-world applications.

Abstract

As artificial intelligence (AI) systems advance, we move towards broad AI: systems capable of performing well on diverse tasks, understanding context, and adapting rapidly to new scenarios. A central challenge for broad AI systems is to generalize over tasks in related domains and to remain robust to distribution shifts. Neuro-symbolic (NeSy) AI bridges the gap between symbolic and sub-symbolic paradigms to address these challenges, enabling adaptable, generalizable, and more interpretable systems. The development of broad AI requires advancements in domain adaptation (DA), which enables models trained on source domains to generalize effectively to unseen target domains. Traditional approaches often rely on parameter optimization and fine-tuning, which can be impractical due to high costs and the risk of catastrophic forgetting. NeSy AI systems use multiple models and methods to generalize to unseen domains and maintain performance across varying conditions. We analyze common DA and NeSy approaches with a focus on deep domain-invariant learning, extending to real-world challenges such as adapting to continuously changing domains and handling large domain gaps. We showcase state-of-the-art model-selection methods for scenarios with limited samples and introduce domain-specific adaptations without gradient-based updates for cases where model tuning is infeasible. This work establishes a framework for scalable and generalizable broad AI systems applicable across various problem settings, demonstrating how symbolic reasoning and large language models can build universal computational graphs that generalize across domains and problems, contributing to more adaptable AI approaches for real-world applications.

Paper Structure (29 sections, 13 equations, 3 figures)

Introduction
Domain Shift, Domain Adaptation and Domain-Invariant Learning
Unsupervised Domain Adaptation
Model Selection and Parameter Choice Methods for Unsupervised Domain Adaptation
Parameter Choice Problem
Parameter Choice Contribution
In-Context Learning
Large Language Models
In-Context Learning and Domain-Invariant Learning
Structure of In-Context Learning for Domain Adaptation
Problem Statement
Implicit Contextual Associations
In-Context Associations and Attention
Domain-Invariance in Language Models
Domain-Invariant Associations
...and 14 more sections

Figures (3)

Figure 1: Illustration of domain-invariant learning. Source and target domain features are transformed into domain-invariant features, which are then made class discriminative. This enables the model to distinguish between different classes while maintaining consistent performance across various domains.
Figure 2: UMAP projection of the latent embeddings of four domains, taken before the classification layer of a GPT-Neo model with $1.3$ billion parameters. The domains are Mathematical, Programming, Natural Language and Random. Each domain includes $40$ samples in its respective domain-specific formulation. Semantically similar concepts cluster together across domains, highlighting the overlap between domains. The Random domain is furthest apart, since its embedded sequences do not share semantic overlap with the other domains.
Figure 3: Distribution of tokens across four domains: Mathematical, Programming, Natural Language and Random. The first row shows the $99$-percentile normalized frequency of tokens on the y-axis. The x-axis shows the unique tokens used across all four domains, concatenated in order. The second and third rows show the normalized embedding distributions of the encoded tokens after the tokenization and encoding phase and of the latent embeddings before the classification layer, respectively.

Theorems & Definitions (7)

Example 1
Example 2
Example 3
Example 4
Example 5
Example 6
Definition 1: Domain-Invariant Language Models
