On Lexical Invariance on Multisets and Graphs
Muhan Zhang
TL;DR
Addresses invariance under injective lexical transformations for multisets and graphs. Develops necessary and sufficient characterizations of the most expressive lexical invariant functions: $f(x)=f'(c)$ for multisets with $c$ the counts of unique elements, and $f(A,X)=f'(A,D)$ for graphs with $D_{ij}=1$ when $X[i]=X[j]$; proofs rely on canonical forms and permutation invariance. Validates theory with synthetic TU-dataset experiments showing that models respecting lexical invariance generalize better and are more robust to hashed inputs. The work provides a principled foundation for hashing-robust representations in anonymized set- and graph-structured data, with implications for privacy-preserving learning.
Abstract
In this draft, we study a novel problem, called lexical invariance, using the medium of multisets and graphs. Traditionally in the NLP domain, lexical invariance indicates that the semantic meaning of a sentence should remain unchanged regardless of the specific lexical or word-based representation of the input. For example, ``The movie was extremely entertaining'' would have the same meaning as ``The film was very enjoyable''. In this paper, we study a more challenging setting, where the output of a function is invariant to any injective transformation applied to the input lexical space. For example, multiset {1,2,3,2} is equivalent to multiset {a,b,c,b} if we specify an injective transformation that maps 1 to a, 2 to b and 3 to c. We study the sufficient and necessary conditions for a most expressive lexical invariant (and permutation invariant) function on multisets and graphs, and proves that for multisets, the function must have a form that only takes the multiset of counts of the unique elements in the original multiset as input. For example, a most expressive lexical invariant function on {a,b,c,b} must have a form that only operates on {1,1,2} (meaning that there are 1, 1, 2 unique elements corresponding to a,c,b). For graphs, we prove that a most expressive lexical invariant and permutation invariant function must have a form that only takes the adjacency matrix and a difference matrix as input, where the (i,j)th element of the difference matrix is 1 if node i and node j have the same feature and 0 otherwise. We perform synthetic experiments on TU datasets to verify our theorems.
