Statistical complexity of software systems represented as multi-layer networks

Jan Žižka

TL;DR

The paper addresses the lack of empirical tools for quantifying software system complexity and proposes statistical complexity as an empirical measure for software systems modeled as multi-layer networks. It formalizes the Statistical Complexity Measure (SCM) as SCM = H · Q, where H is the Shannon entropy of the system-state distribution and Q is the Jensen-Shannon disequilibrium relative to a uniform reference distribution, with normalization so that systems of different sizes can be compared. It validates SCM through OMNeT++ simulations of ordered, layered, and chaotic topologies, with system sizes from 128 to 1024 components and simulation times up to 500 seconds. The results show SCM behavior consistent with theory and indicate that layered configurations can maximize complexity. The paper also discusses limitations in interpreting disequilibrium for engineered software and outlines future work on applying SCM to real systems and integrating it into design optimization and anomaly detection.
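To make the definition SCM = H · Q concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes the system-state distribution is given as a list of probabilities, that H is normalized by log N, and that Q is the Jensen-Shannon divergence from the uniform distribution divided by its maximum value (reached by a single-state distribution). The function names and the exact normalization are illustrative assumptions and may differ from those used in the paper.

```python
import math


def shannon_entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two distributions of equal length."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2


def scm(p):
    """Hypothetical normalized SCM = H * Q for a state distribution p.

    Assumption: H is divided by log(N) and Q by the largest possible
    Jensen-Shannon divergence from uniform, so both lie in [0, 1].
    """
    n = len(p)
    if n < 2:
        raise ValueError("need at least two states")
    uniform = [1.0 / n] * n
    delta = [1.0] + [0.0] * (n - 1)  # fully ordered: one state has all the mass
    h = shannon_entropy(p) / math.log(n)
    q = js_divergence(p, uniform) / js_divergence(delta, uniform)
    return h * q


if __name__ == "__main__":
    ordered = [1.0, 0.0, 0.0, 0.0]      # H = 0, so SCM vanishes
    chaotic = [0.25, 0.25, 0.25, 0.25]  # Q = 0, so SCM vanishes
    mixed = [0.7, 0.1, 0.1, 0.1]        # intermediate case, SCM > 0
    for name, dist in [("ordered", ordered), ("chaotic", chaotic), ("mixed", mixed)]:
        print(name, scm(dist))
```

Under these assumptions the measure vanishes at both extremes, a fully ordered system (H = 0) and a fully uniform one (Q = 0), and is positive only for intermediate distributions.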

Abstract

Software systems are expansive, exhibiting behaviors characteristic of complex systems, such as self-organization and emergence. These systems, highlighted by advancements in Large Language Models (LLMs) and other AI applications developed by entities like DeepMind and OpenAI, showcase remarkable properties. Despite these advancements, there is a notable absence of effective tools for empirically measuring software system complexity, hindering our ability to compare these systems or assess the impact of modifications on their properties. Addressing this gap, we propose the adoption of statistical complexity, a metric already applied in fields such as physics, biology, and economics, as an empirical measure for evaluating the complexity of software systems. Our approach involves calculating the statistical complexity of software systems modeled as multi-layer networks, validated by simulations and theoretical comparisons. This measure offers insights into the organizational structure of software systems, exhibits promising consistency with theoretical expectations, and paves the way for leveraging statistical complexity as a tool to deepen our understanding of complex software systems and of their plausible and implausible emergent behaviors.

Paper Structure

This paper contains 4 sections, 7 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: Sketch of the intuitive notion of the magnitudes of “information” (H) and “disequilibrium” (Q) for the physical systems and the behavior intuitively required for the magnitude of “complexity.” The quantity SCM = H·Q is proposed to measure such a magnitude (López-Ruiz et al., 1995).