Weak baselines and reporting biases lead to overoptimism in machine learning for fluid-related partial differential equations

Nick McGreivy, Ammar Hakim

TL;DR

ML-for-PDE-solving research is overoptimistic: weak baselines lead to overly positive results, while reporting biases lead to underreporting of negative results.

Abstract

One of the most promising applications of machine learning (ML) in computational physics is to accelerate the solution of partial differential equations (PDEs). The key objective of ML-based PDE solvers is to output a sufficiently accurate solution faster than standard numerical methods, which are used as a baseline comparison. We first perform a systematic review of the ML-for-PDE-solving literature. Of articles that use ML to solve a fluid-related PDE and claim to outperform a standard numerical method, we determine that 79% (60/76) compare to a weak baseline. Second, we find evidence that reporting biases, especially outcome reporting bias and publication bias, are widespread. We conclude that ML-for-PDE-solving research is overoptimistic: weak baselines lead to overly positive results, while reporting biases lead to underreporting of negative results. To a large extent, these issues appear to be caused by factors similar to those of past reproducibility crises: researcher degrees of freedom and a bias towards positive results. We call for bottom-up cultural changes to minimize biased reporting as well as top-down structural reforms intended to reduce perverse incentives for doing so.

Paper Structure (15 sections, 1 equation, 1 figure, 1 table)

Figures (1)

  • Figure 1: The cumulative effects of weak baselines and reporting biases on samples A and B. Each circle or hexagon represents an article, while each color represents the result of comparing the relative speed and accuracy to a standard numerical method. In (a) we estimate what the results would be with strong baselines and without outcome reporting bias. (b) shows what the results would likely be without outcome reporting bias. (c) shows the results in the published literature.