Non-Determinism and the Lawlessness of Machine Learning Code

A. Feder Cooper, Jonathan Frankle, Christopher De Sa

TL;DR

The paper examines how non-determinism in ML challenges legal analyses that focus on individual outcomes and summary error rates. It distinguishes non-determinism from stochasticity and advocates reasoning about ML outputs as distributions over possible outcomes, which captures harms and arbitrariness that individual decisions alone do not reveal. Key contributions include precise definitions of the two concepts, a distributional perspective illustrated with credit-scoring predictions and patterns of model outcomes, and the argument that ML code is effectively lawless under the standard cyberlaw frame of "code as law," because that frame assumes code is deterministic. Practically, it urges legal and regulatory frameworks to incorporate distributional reasoning so that they can better connect systemic harms with individual outcomes and guide robust ML deployment.

Abstract

Legal literature on machine learning (ML) tends to focus on harms, and thus tends to reason about individual model outcomes and summary error rates. This focus has masked important aspects of ML that are rooted in its reliance on randomness -- namely, stochasticity and non-determinism. While some recent work has begun to reason about the relationship between stochasticity and arbitrariness in legal contexts, the role of non-determinism more broadly remains unexamined. In this paper, we clarify the overlap and differences between these two concepts, and show that the effects of non-determinism, and consequently its implications for the law, become clearer from the perspective of reasoning about ML outputs as distributions over possible outcomes. This distributional viewpoint accounts for randomness by emphasizing the possible outcomes of ML. Importantly, this type of reasoning is not exclusive with current legal reasoning; it complements (and in fact can strengthen) analyses concerning individual, concrete outcomes for specific automated decisions. By illuminating the important role of non-determinism, we demonstrate that ML code falls outside of the cyberlaw frame of treating "code as law," as this frame assumes that code is deterministic. We conclude with a brief discussion of what work ML can do to constrain the potentially harm-inducing effects of non-determinism, and we indicate where the law must do work to bridge the gap between its current individual-outcome focus and the distributional approach that we recommend.

Paper Structure (7 sections, 2 figures)

Figures (2)

  • Figure 1: Synthetic probability distributions for possible predicted credit scores of two different individuals.
  • Figure 2: Synthetic patterns of model outcomes for two models trained for the same task with the same algorithm and training data, but possibly on different computers with different hardware random seeds. Non-determinism in the training process yields different patterns of model outcomes. We visualize each pattern like a probability distribution, but there is no guarantee that we can reason reliably about it with the tools of probability.
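
The distributional view behind these figures can be made concrete with a small sketch. The Python below is an illustration only, not the authors' method: the synthetic data, the bootstrap-based `train` function, and the two individuals "A" and "B" are all hypothetical. It retrains a toy score model under many random seeds and summarizes the spread of predicted scores for each individual, rather than reporting a single prediction. Note that it models only seeded stochasticity; hardware-level non-determinism of the kind described for Figure 2 cannot be replayed by fixing a seed, which is part of why it is harder to reason about.

```python
import random
import statistics

def make_data(n=200, seed=0):
    """Hypothetical synthetic data: one feature x in [0, 1] and a noisy score y."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [300 + 500 * x + rng.gauss(0, 60) for x in xs]
    return xs, ys

def train(xs, ys, seed):
    """Fit a least-squares line on a bootstrap resample selected by `seed`."""
    rng = random.Random(seed)
    idx = [rng.randrange(len(xs)) for _ in xs]
    bx = [xs[i] for i in idx]
    by = [ys[i] for i in idx]
    mx, my = statistics.fmean(bx), statistics.fmean(by)
    sxx = sum((x - mx) ** 2 for x in bx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(bx, by))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def outcome_distribution(x, xs, ys, seeds):
    """Predicted scores for one individual across models trained with different seeds."""
    return [train(xs, ys, s)(x) for s in seeds]

xs, ys = make_data()
seeds = range(100)
# Two hypothetical individuals, described only by their feature value.
for name, x in [("A", 0.5), ("B", 0.98)]:
    preds = outcome_distribution(x, xs, ys, seeds)
    print(name, "mean:", round(statistics.fmean(preds), 1),
          "stdev:", round(statistics.stdev(preds), 1))
```

Each printed row summarizes a distribution over possible outcomes for one person. Comparing the spreads, and not only the means, is the kind of reasoning the distributional viewpoint adds to an analysis of any single predicted score.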

Theorems & Definitions (2)

  • definition 1
  • definition 2