Weak baselines and reporting biases lead to overoptimism in machine learning for fluid-related partial differential equations

Nick McGreivy; Ammar Hakim

TL;DR

ML-for-PDE solving research is overoptimistic: weak baselines lead to overly positive results, while reporting biases lead to underreporting of negative results.

Abstract

One of the most promising applications of machine learning (ML) in computational physics is to accelerate the solution of partial differential equations (PDEs). The key objective of ML-based PDE solvers is to output a sufficiently accurate solution faster than standard numerical methods, which are used as a baseline comparison. We first perform a systematic review of the ML-for-PDE solving literature. Of articles that use ML to solve a fluid-related PDE and claim to outperform a standard numerical method, we determine that 79% (60/76) compare to a weak baseline. Second, we find evidence that reporting biases, especially outcome reporting bias and publication bias, are widespread. We conclude that ML-for-PDE solving research is overoptimistic: weak baselines lead to overly positive results, while reporting biases lead to underreporting of negative results. To a large extent, these issues appear to be caused by factors similar to those of past reproducibility crises: researcher degrees of freedom and a bias towards positive results. We call for bottom-up cultural changes to minimize biased reporting as well as top-down structural reforms intended to reduce perverse incentives for doing so.

Paper Structure

This paper contains 15 sections, 1 equation, 1 figure, and 1 table.

Introduction
Weak baselines
Reporting biases
Limitations
Discussion
Methods
Systematic review
Inclusion criteria
Exclusion criteria
Search process
Criteria for evaluating baselines
Details of stronger baselines in Table \ref{tab:reproducebaselines}
Random sample of ML-for-PDE articles
Random sample of PINN articles
Declarations

Figures (1)

Figure 1: The cumulative effects of weak baselines and reporting biases on samples A and B. Each circle or hexagon represents an article, while each color represents the result of comparing the relative speed and accuracy to a standard numerical method. In (a) we estimate what the results would be with strong baselines and without outcome reporting bias. (b) shows what the results would likely be without outcome reporting bias. (c) shows the results in the published literature.
