Testing the Machine Consciousness Hypothesis
Stephen Fitz
TL;DR
The paper addresses whether consciousness can arise in machines within a substrate-free, functionalist framework. It proposes an in silico program where distributed predictive agents atop a universal cellular automaton develop collective self-models through lossy communication, using an information-geometric and topological toolkit to diagnose consciousness-like coherence. Key contributions include a concrete architecture of transformer-based cortical columns on a simple substrate, formal metrics for integration, reflexivity, temporality, and causal efficacy, and a design space linking substrate dynamics to collective selfhood. The work aims to shift consciousness research toward empirically testable, substrate-agnostic mechanisms with broad interdisciplinary implications.
Abstract
The Machine Consciousness Hypothesis states that consciousness is a substrate-free functional property of computational systems capable of second-order perception. I propose a research program to investigate this idea in silico by studying how collective self-models (coherent, self-referential representations) emerge from distributed learning systems embedded within universal self-organizing environments. The theory outlined here starts from the supposition that consciousness is an emergent property of collective intelligence systems undergoing synchronization of prediction through communication. It is not an epiphenomenon of individual modeling but a property of the language that a system evolves to internally describe itself. For a model of base reality, I begin with a minimal but general computational world: a cellular automaton, which exhibits both computational irreducibility and local reducibility. On top of this computational substrate, I introduce a network of local, predictive, representational (neural) models capable of communication and adaptation. I use this layered model to study how collective intelligence gives rise to self-representation as a direct consequence of inter-agent alignment. I suggest that consciousness does not emerge from modeling per se, but from communication. It arises from the noisy, lossy exchange of predictive messages between groups of local observers describing persistent patterns in the underlying computational substrate (base reality). It is through this representational dialogue that a shared model arises, aligning many partial views of the world. The broader goal is to develop empirically testable theories of machine consciousness, by studying how internal self-models may form in distributed systems without centralized control.
