Bridging Philosophy and Machine Learning: A Structuralist Framework for Classifying Neural Network Representations

Yildiz Culcu

TL;DR

The paper investigates the philosophical assumptions embedded in neural network representations and interpretability work. It introduces a structuralist decision framework with three criteria—entity elimination, source of structure, and mode of existence—and applies a modified PRISMA systematic review to literature from the last two decades. Analysis of five influential papers reveals a dominant structural idealism stance and a striking absence of structural realism, highlighting tensions in interpretability and epistemic trust. The framework thus provides a rigorous, interdisciplinary foundation for future collaboration between philosophy of science and machine learning to clarify what representations purport to be and how we should reason about them.

Abstract

Machine learning models increasingly function as representational systems, yet the philosophical assumptions underlying their internal structures remain largely unexamined. This paper develops a structuralist decision framework for classifying the implicit ontological commitments made in machine learning research on neural network representations. Using a modified PRISMA protocol, a systematic review of the last two decades of literature on representation learning and interpretability is conducted. Five influential papers are analysed through three hierarchical criteria derived from structuralist philosophy of science: entity elimination, source of structure, and mode of existence. The results reveal a pronounced tendency toward structural idealism, where learned representations are treated as model-dependent constructions shaped by architecture, data priors, and training dynamics. Eliminative and non-eliminative structuralist stances appear selectively, while structural realism is notably absent. The proposed framework clarifies conceptual tensions in debates on interpretability, emergence, and epistemic trust in machine learning, and offers a rigorous foundation for future interdisciplinary work between philosophy of science and machine learning.

Paper Structure

This paper contains 22 sections, 1 figure, 2 tables.

Figures (1)

  • Figure 1: Decision tree for classifying structuralist positions in neural network representation papers.