Vectoring Languages

Joseph Chen

TL;DR

The paper addresses the challenge of reconciling philosophical theories of language with modern AI language models by proposing a vectoring framework that treats language as a high-dimensional vector space $V_L$ and uses projections to extract interpretable attribute subspaces. It formalizes definitions, contrasts vectoring with practical vectoring, and connects to existing theories (e.g., Word2Vec, Transformers) through a linear-algebra-inspired lens. It argues that meanings reside in structured projections like $W_{meaning}$ and introduces a taxonomy to map words to subspace coordinates via functions $F$. The work aims to guide future research and experimental exploration by uniting philosophy with rapidly advancing AI models to accelerate scientific progress, while acknowledging current limitations and proposing directions for agent-based updating and multi-perspective integration.
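As a rough illustration of the projection idea summarized above, the sketch below reads off the coordinates of a toy word vector in a small attribute subspace. It is only a minimal sketch under stated assumptions: the 4-dimensional space, the numbers, and the names `project` and `w_meaning_basis` are hypothetical and do not come from the paper, which defines its own $V_L$, $W_{meaning}$, and $F$.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project(v: Vector, basis: List[Vector]) -> List[float]:
    """Coordinates of v in the subspace spanned by `basis`.

    Assumes `basis` is orthonormal, so each coordinate is just an
    inner product with the corresponding basis vector.
    """
    return [dot(v, b) for b in basis]


# Hypothetical toy vector standing in for a word in the language space.
word = [0.5, -1.0, 2.0, 0.0]

# Hypothetical orthonormal basis for a 2-dimensional attribute subspace
# (an assumption for illustration, not the paper's W_meaning).
w_meaning_basis = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

coords = project(word, w_meaning_basis)
print(coords)
```

The design choice here is to keep the basis orthonormal so that projection reduces to inner products; a non-orthonormal basis would need a least-squares solve instead.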

Abstract

Recent breakthroughs in large language models (LLMs) have drawn global attention, and research has accelerated steadily ever since. Philosophers and psychologists have studied the structure of language for decades, but they have struggled to find a theory that directly benefits from the breakthroughs in LLMs. In this article, we propose a novel structure of language that closely reflects the mechanisms behind language models, and we show that this structure also captures the diverse nature of language better than previous methods. We adapt an analogy from linear algebra to strengthen the basis of this perspective. We further discuss how this perspective differs from the design philosophy of current language models. Lastly, we discuss how this perspective can lead to the research directions that may most rapidly accelerate scientific progress.

Paper Structure (19 sections)

This paper contains 19 sections.

Introduction
Related Works
Word vector representations as a theory
Word representations in Machines
The danger of LLMs
Vectoring
Practical vectoring
Word2Vec
Transformers
But how does a machine learning model learn meanings from data?
Differences between vectoring and practical vectoring
Taxonomy and Definition
Response to Word Meaning in Minds and Machines
Word Representations Should Support Describing a Perceptually Present Scenario or Understanding Such a Description
Word Representations Should Support Choosing Words on the Basis of Internal Desires, Goals, or Plans
...and 4 more sections
