Reproducibility: The New Frontier in AI Governance

Israel Mason-Williams, Gabryel Mason-Williams

TL;DR

The paper argues that effective AI governance depends on a high Signal-To-Noise Ratio in research outputs, which weak reproducibility currently erodes. Drawing on reproducibility crises in Economics, Cancer Biology, and Psychology, it identifies preregistration, increased statistical power, and negative-result reporting as key mechanisms for improving policy-relevant AI science, and discusses how adopting them could reduce information asymmetries and mitigate regulatory capture. The work presents a theory of change in which stronger reproducibility protocols give policymakers clearer assessments of AI capabilities and risks, with trade-offs in research speed and compute requirements.

Abstract

AI policymakers are responsible for delivering effective governance mechanisms that can provide safe, aligned and trustworthy AI development. However, the information environment offered to policymakers is characterised by an unnecessarily low Signal-To-Noise Ratio, favouring regulatory capture and creating deep uncertainty and divides on which risks should be prioritised from a governance perspective. We posit that the current publication speeds in AI, combined with the lack of strong scientific standards via weak reproducibility protocols, effectively erode the power of policymakers to enact meaningful policy and governance protocols. Our paper outlines how AI research could adopt stricter reproducibility guidelines to assist governance endeavours and improve consensus on the AI risk landscape. We evaluate the forthcoming reproducibility crisis within AI research through the lens of crises in other scientific domains, providing a commentary on how adopting preregistration, increased statistical power and negative result publication as reproducibility protocols can enable effective AI governance. While we maintain that AI governance must be reactive due to AI's significant societal implications, we argue that policymakers and governments must consider reproducibility protocols as a core tool in the governance arsenal and demand higher standards for AI research. Code to replicate data and figures: https://github.com/IFMW01/reproducibility-the-new-frontier-in-ai-governance

Paper Structure

This paper contains 20 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Indicative plot of the speed of publication versus reproducibility standards in scientific domains with average publication trajectories ($\uparrow$) over the last five years. Please see Appendix Section \ref{sec:Speed_Reproduce} for the methodology used to produce this plot.
  • Figure 2: Number of NeurIPS publications that mention GitHub between 2019 and 2024. We use GitHub mentions as a proxy for replicability. The motivations and limitations of this approach are described in Appendix Section \ref{sec:GitHub_NeurIPS}.
  • Figure 3: Proposed impact of reproducibility protocols.
  • Figure 4: Blown up plot of publication speed versus reproducibility standards.
  • Figure 5: Reproducibility: Theory of Change.
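
The GitHub-mention proxy described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' method (their code is in the repository linked in the abstract); the function name `count_github_mentions`, the input format of `(year, text)` pairs, and the matching pattern are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: any case-insensitive occurrence of "github" counts as a mention.
GITHUB_PATTERN = re.compile(r"github", re.IGNORECASE)


def count_github_mentions(papers):
    """Count papers per year whose text mentions GitHub.

    `papers` is a hypothetical iterable of (year, text) pairs.
    Returns a dict mapping year -> (papers_mentioning_github, total_papers).
    """
    totals = Counter()
    mentions = Counter()
    for year, text in papers:
        totals[year] += 1
        if GITHUB_PATTERN.search(text):
            mentions[year] += 1
    # Counter returns 0 for years with no mentions, so every year is reported.
    return {year: (mentions[year], totals[year]) for year in sorted(totals)}


if __name__ == "__main__":
    # Toy records, not real NeurIPS data.
    sample = [
        (2019, "Code is available at https://github.com/example/repo."),
        (2019, "We describe our method in detail."),
        (2020, "See our GitHub repository for scripts."),
    ]
    for year, (hits, total) in count_github_mentions(sample).items():
        print(year, hits, total)
```

Reporting the per-year total alongside the mention count makes it possible to distinguish a rise in the share of papers linking code from a rise in submissions overall. A mention alone does not show that the linked code runs or reproduces the results, which is why it serves only as a proxy.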