Neuro-Symbolic Contrastive Learning for Cross-domain Inference

Mingyue Liu, Ryo Ueda, Zhen Wan, Katsumi Inoue, Chris G. Willcocks

TL;DR

This work tackles the issue that pretrained language models struggle with genuine logical inference by combining inductive logic programming with neural networks through a neuro-symbolic contrastive learning framework. The method uses ILP-derived meta-rules to generate hard positive and hard negative pairs, mapping data between logical forms and natural language via LoLA and Grammatical Framework, and optimizes a contrastive loss $\mathcal{L}_{cl}$ to carve a logical structure into the neural embedding space. Empirical results on ILP-inspired NLI datasets show improved cross-domain and cross-form transfer, with logic-form representations sometimes outperforming NL forms in reasoning tasks. The approach advances interpretable, generalizable reasoning for NLP by leveraging symbolic rules to guide differentiable learning and data augmentation.

Abstract

Pre-trained language models (PLMs) have made significant advances in natural language inference (NLI) tasks; however, their sensitivity to textual perturbations and dependence on large datasets indicate an over-reliance on shallow heuristics. In contrast, inductive logic programming (ILP) excels at inferring logical relationships across diverse, sparse and limited datasets, but its discrete nature requires the inputs to be precisely specified, which limits its application. This paper proposes a bridge between the two approaches: neuro-symbolic contrastive learning. This allows for smooth and differentiable optimisation that improves logical accuracy across an otherwise discrete, noisy, and sparse topological space of logical functions. We show that abstract logical relationships can be effectively embedded within a neuro-symbolic paradigm by representing data as logic programs and sets of logic rules. The embedding space captures highly varied textual information with similar logical relations, and can also separate textually similar inputs whose logical relations differ. Experimental results demonstrate that our approach significantly improves the inference capabilities of the models in terms of generalisation and reasoning.
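
The contrastive objective described above can be illustrated with a small sketch. It is not the paper's exact $\mathcal{L}_{cl}$, which this summary does not spell out. It assumes a generic InfoNCE-style loss over cosine similarities, and the function names, the temperature value, and the toy two-dimensional embeddings are all hypothetical. The idea shown is that the loss is small when the anchor embedding lies close to a logically consistent positive and far from a logically contradictory negative, and large when the reverse holds.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negatives: Sequence[Sequence[float]],
    temperature: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """Generic InfoNCE-style loss for one anchor (an assumption, not the paper's loss).

    Pulls the anchor towards the positive and pushes it away from every negative.
    """
    pos_term = math.exp(cosine_similarity(anchor, positive) / temperature)
    neg_term = sum(
        math.exp(cosine_similarity(anchor, n) / temperature) for n in negatives
    )
    return -math.log(pos_term / (pos_term + neg_term))


# Toy embeddings (hypothetical): the positive coincides with the anchor,
# the negative is orthogonal to it.
anchor = [1.0, 0.0]
aligned = [1.0, 0.0]
orthogonal = [0.0, 1.0]

well_separated = info_nce_loss(anchor, aligned, [orthogonal], temperature=1.0)
badly_separated = info_nce_loss(anchor, orthogonal, [aligned], temperature=1.0)
print(well_separated < badly_separated)
```

In a real setting the vectors would come from a PLM encoder applied to the anchor, its hard positive, and its hard negative, and the loss would be averaged over a batch and minimised by gradient descent.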

Paper Structure

This paper contains 21 sections, 14 equations, 3 figures, 7 tables, 1 algorithm.

Figures (3)

  • Figure 1: Logical data is discrete and sparse (red bars) and difficult to directly model (left blue curve) by a differentiable neural network $f_\theta$. However, we map meta-rules to-and-from the smooth PLM embedding space and utilise contrastive pairs (vertical arrows) to carve the sharp underlying logical structure (rightmost blue function) into $f_\theta$, enabling logical generalisation and logical reasoning.
  • Figure 2: Illustration of an anchor data point $E = (P,L)$ with its corresponding positive and negative pairs. The positive pair $E^+ = (P^+, L^+)$ maintains logical consistency with the anchor, while the negative pair $E^- = (P^-, L^-)$ introduces a logical contradiction despite overlapping textual content.
  • Figure 3: A model of the translation system is presented, including an example of translating a First-Order Logic (FOL) formula into English. Each node in the Abstract Syntax Tree (AST) is named after the syntactic function used to construct the corresponding constituent [calo2022enhancing]. The right side of this figure displays the tree structure following an optimisation step applied to the initial configuration on the left side.