Universal Syntactic Structures: Modeling Syntax for Various Natural Languages

Min K. Kim, Hafu Takero, Sara Fedovik

TL;DR

This work proposes that human language may be governed by universal syntactic structures and introduces the synapper, a multi-dimensional looping representation that unifies the six canonical word orders through a direction-of-flow mechanism. By treating syntax as a closed-loop system of interconnected tokens, the approach aims to explain rapid sentence formation, cross-language translation, and recursion, while offering a brain-inspired alternative to current UD and neural MT frameworks. The paper presents case studies and analyses, including a Korean–English translation scenario and a critique of BLEU-based evaluation, arguing that syntax-centric evaluation may better capture translation quality. It discusses implications for universal grammar, critical period learning, and AI, suggesting that synapper could inform models that approach human-like language processing and potentially contribute to advances toward artificial general intelligence. Overall, the work advocates deeper exploration of universal syntactic structures and their applications in AI, translation, and cognitive science, while calling for rigorous scrutiny and broader validation.

Abstract

We aim to provide an explanation for how the human brain might connect words for sentence formation. A novel approach to modeling syntactic representation is introduced, potentially showing the existence of universal syntactic structures for all natural languages. As the discovery of DNA's double helix structure shed light on the inner workings of genetics, we wish to introduce a basic understanding of how language might work in the human brain. It could be the brain's way of encoding and decoding knowledge. It also brings some insight into theories in linguistics, psychology, and cognitive science. After looking into the logic behind universal syntactic structures and the methodology of the modeling technique, we attempt to analyze corpora that showcase universality in the language process of different natural languages such as English and Korean. Lastly, we discuss the critical period hypothesis, universal grammar, and a few other assertions on language for the purpose of advancing our understanding of the human brain.

Paper Structure (11 sections, 9 figures)

Figures (9)

  • Figure 1: All six word orders belong to one structure; each is obtained by reading the three components either clockwise (SVO, VOS, OSV) or counterclockwise (SOV, OVS, VSO).
  • Figure 2: The starting constituent for English is underlined (Jane). In SVO languages, the sentence is read clockwise starting with the subject. The branch words that are connected to the node horse are read with the far-left word first (a, very, fast, brown). In some languages like French and Spanish, some branch words are supposed to be read after the node (a, horse, brown, very, fast).
  • Figure 3: For SOV languages, the sentence is read counterclockwise starting with Tim (Tim, the, hospital, to, going, is).
  • Figure 4: The subject phrase exists in multiple loops or layers. They all have the same direction of flow (either clockwise or counterclockwise) for the given sentence.
  • Figure 5: Sentences that appear to have the same structure when written linearly show up as two distinct synapper models. (The loops connecting the beginning and the end of the sentences are removed for simplification.)
  • ...and 4 more figures
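
The direction-of-flow idea in the captions for Figures 1 and 3 can be illustrated with a small sketch. This is not the authors' implementation: the list-based loop, the `(word, branches)` node layout, and the names `read_loop`, `word_orders`, and `flatten` are all assumptions made here for illustration. It assumes the three components sit on one loop in the clockwise order S, V, O, and that branch words are read before their node, as the Figure 2 caption describes for English.

```python
from typing import Dict, List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical layout: the three components placed clockwise on one loop.
COMPONENTS: Tuple[str, ...] = ("S", "V", "O")


def read_loop(loop: Sequence[T], start: int = 0, clockwise: bool = True) -> List[T]:
    """Read every item on a closed loop once, beginning at index `start`."""
    n = len(loop)
    step = 1 if clockwise else -1
    return [loop[(start + step * k) % n] for k in range(n)]


def word_orders(loop: Sequence[str] = COMPONENTS) -> Dict[str, str]:
    """Map each word order reachable on the loop to its direction of flow."""
    orders: Dict[str, str] = {}
    for clockwise in (True, False):
        direction = "clockwise" if clockwise else "counterclockwise"
        for start in range(len(loop)):
            orders["".join(read_loop(loop, start, clockwise))] = direction
    return orders


# Hypothetical encoding of the Figure 3 sentence: each node is
# (word, branch words), with branch words read before their node.
Node = Tuple[str, Tuple[str, ...]]
SENTENCE: List[Node] = [
    ("Tim", ()),
    ("is", ()),
    ("going", ()),
    ("to", ()),
    ("hospital", ("the",)),
]


def flatten(nodes: Sequence[Node]) -> List[str]:
    """Emit branch words first, then the node word, for each node in order."""
    words: List[str] = []
    for word, branches in nodes:
        words.extend(branches)
        words.append(word)
    return words


if __name__ == "__main__":
    print(word_orders())
    print(flatten(read_loop(SENTENCE, 0, clockwise=True)))
    print(flatten(read_loop(SENTENCE, 0, clockwise=False)))
```

Under these assumptions, the three clockwise readings give SVO, VOS, and OSV, and the three counterclockwise readings give SOV, VSO, and OVS, matching the grouping in Figure 1. Reading the hypothetical `SENTENCE` loop counterclockwise from Tim yields the sequence listed in the Figure 3 caption (Tim, the, hospital, to, going, is), while the clockwise reading yields the English order.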