The Semiotic Channel Principle: Measuring the Capacity for Meaning in LLM Communication
Davide Picca
TL;DR
The paper reframes LLM-mediated communication as a semiotic process and models meaning transmission with two information-theoretic measures: semiotic breadth $S$ (source entropy) and decipherability $D$ (mutual information between messages and interpretations). Both are functions of a generative complexity parameter $\lambda$ and are defined within a semiotic channel parameterized by audience $\mathcal{A}$ and context $\mathrm{Ctx}$. The channel's capacity is $\mathcal{C}_{\mathcal{A},\mathrm{Ctx}} = \max_{\lambda} D_{\mathcal{A},\mathrm{Ctx}}(\lambda)$, subject to $D(\lambda) \le S(\lambda)$, and both $S$ and $D$ can be estimated empirically from observable artifacts. The framework is applied to model profiling, prompt/context design, interpretive risk analysis, and adaptive human–machine systems, supported by measurement protocols and theoretical safeguards such as Fano's inequality, which bounds error in goal-oriented tasks. The paper also discusses methodological and theoretical limitations, including the reliance on proxy-based measurements and the dynamic nature of interpretation across contexts. Overall, the work offers a novel bridge between semiotics and AI, shifting the focus from opaque model internals to observable signs and interpretations that can be quantified and managed in real-world settings.
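The definitions above can be made concrete with a minimal sketch. It assumes that, for each value of $\lambda$, a joint distribution over messages and interpretations has already been estimated; the function names, the three-point $\lambda$ grid, and the toy probability tables are all hypothetical and are not taken from the paper. The code computes $S$ as the entropy of the message marginal, $D$ as the mutual information of the joint table, and the capacity as the largest $D$ on the grid.

```python
import math


def entropy(p):
    """Shannon entropy (bits) of a probability vector."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def mutual_information(joint):
    """Mutual information (bits) of joint[m][i] = P(message m, interpretation i)."""
    p_msg = [sum(row) for row in joint]
    p_int = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for m, row in enumerate(joint):
        for i, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (p_msg[m] * p_int[i]))
    return mi


def breadth_and_decipherability(joint):
    """Return (S, D): entropy of the message marginal and the mutual information."""
    p_msg = [sum(row) for row in joint]
    return entropy(p_msg), mutual_information(joint)


def capacity(joint_by_lambda):
    """Return (best lambda, max D) over a finite grid of lambda values."""
    best = max(joint_by_lambda, key=lambda lam: mutual_information(joint_by_lambda[lam]))
    return best, mutual_information(joint_by_lambda[best])


# Hypothetical toy channel (illustrative numbers, not data from the paper):
# higher lambda means more distinct messages, but noisier interpretation.
toy_channel = {
    0.2: [[0.5, 0.0], [0.0, 0.5]],                                        # 2 messages, read perfectly
    0.5: [[0.22 if m == i else 0.01 for i in range(4)] for m in range(4)],  # 4 messages, mostly read correctly
    0.9: [[1 / 64] * 8 for _ in range(8)],                                # 8 messages, readings independent of message
}

if __name__ == "__main__":
    for lam, joint in sorted(toy_channel.items()):
        s, d = breadth_and_decipherability(joint)
        print(f"lambda={lam}: S={s:.3f} bits, D={d:.3f} bits")
    best_lam, cap = capacity(toy_channel)
    print(f"capacity on this grid: {cap:.3f} bits at lambda={best_lam}")
```

In this toy setup breadth grows with $\lambda$ while decipherability peaks at the middle value, which is the shape of trade-off the capacity definition is meant to capture. A grid search is used only for simplicity; any one-dimensional optimizer over $\lambda$ would serve the same role.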
Abstract
This paper proposes a novel semiotic framework for analyzing Large Language Models (LLMs), conceptualizing them as stochastic semiotic engines whose outputs demand active, asymmetric human interpretation. We formalize the trade-off between expressive richness (semiotic breadth) and interpretive stability (decipherability) using information-theoretic tools. Breadth is quantified as source entropy, and decipherability as the mutual information between messages and human interpretations. We introduce a generative complexity parameter $\lambda$ that governs this trade-off, as both breadth and decipherability are functions of $\lambda$. The core trade-off is modeled as an emergent property of their distinct responses to $\lambda$. We define a semiotic channel, parameterized by audience and context, and posit a capacity constraint on meaning transmission, operationally defined as the maximum decipherability attained by optimizing $\lambda$. This reframing shifts analysis from opaque model internals to observable textual artifacts, enabling empirical measurement of breadth and decipherability. We demonstrate the framework's utility across four key applications: (i) model profiling; (ii) optimizing prompt/context design; (iii) risk analysis based on ambiguity; and (iv) adaptive semiotic systems. We conclude that this capacity-based semiotic approach offers a rigorous, actionable toolkit for understanding, evaluating, and designing LLM-mediated communication.
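The quantities named in the abstract can be written compactly as follows. The random-variable names are assumptions made here for illustration and may differ from the paper's own notation: $M_\lambda$ denotes the message produced at complexity $\lambda$, and $Y_{\mathcal{A},\mathrm{Ctx}}$ denotes the interpretation formed by audience $\mathcal{A}$ in context $\mathrm{Ctx}$.

$$
S(\lambda) = H(M_\lambda), \qquad
D_{\mathcal{A},\mathrm{Ctx}}(\lambda) = I(M_\lambda; Y_{\mathcal{A},\mathrm{Ctx}}) = H(M_\lambda) - H(M_\lambda \mid Y_{\mathcal{A},\mathrm{Ctx}})
$$

$$
\mathcal{C}_{\mathcal{A},\mathrm{Ctx}} = \max_{\lambda} D_{\mathcal{A},\mathrm{Ctx}}(\lambda)
$$

For discrete messages the conditional entropy $H(M_\lambda \mid Y_{\mathcal{A},\mathrm{Ctx}})$ is non-negative, so $D_{\mathcal{A},\mathrm{Ctx}}(\lambda) \le S(\lambda)$ for every $\lambda$: an audience cannot recover more information than the source emits.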
