A Mathematics-Guided Approach to Floating-Point Error Detection

Youshuai Tan, Zhanwei Zhang, Zishuo Ding, Lianyu Zheng, Jinfu Chen, Weiyi Shang

TL;DR

This paper tackles the problem of reliably detecting inputs that cause substantial floating-point errors in programs. It introduces MGDE, a mathematically guided method that reframes detection as a root-finding task and leverages the Newton-Raphson method for fast, long-range convergence while avoiding heavy high-precision error computations. A PI-detector is used to prune false positives, and extensive evaluation on the FPCC dataset shows MGDE significantly outperforms the prior state of the art in both bug discovery and runtime efficiency, including multi-input scenarios. The results indicate MGDE’s practical value for robust floating-point software testing and its potential to guide future mathematically grounded detection methods.
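To make the root-finding framing concrete, here is a minimal sketch of the generic Newton-Raphson iteration. It is not MGDE itself: the summary above does not spell out the algorithm, and the helper name `newton_raphson`, the tolerance, and the iteration cap are illustrative assumptions. The link to error detection is also an assumption on our part, namely that large relative errors tend to cluster near zeros of the computed function, so that finding a root points to a candidate error-inducing input.

```python
import math
from typing import Callable


def newton_raphson(
    f: Callable[[float], float],
    df: Callable[[float], float],
    x0: float,
    tol: float = 1e-12,      # illustrative tolerance, not taken from the paper
    max_iter: int = 50,      # illustrative cap, not taken from the paper
) -> float:
    """Generic Newton-Raphson iteration: x_{k+1} = x_k - f(x_k) / f'(x_k)."""
    x = x0
    for _ in range(max_iter):
        slope = df(x)
        if slope == 0.0:
            raise ZeroDivisionError("derivative vanished; Newton step undefined")
        x_next = x - f(x) / slope
        # Stop when the step is small relative to the current magnitude.
        if abs(x_next - x) <= tol * max(1.0, abs(x_next)):
            return x_next
        x = x_next
    raise RuntimeError("Newton-Raphson did not converge within max_iter steps")


# Hypothetical usage: locate the zero of sin(x) near x = 3.
root = newton_raphson(math.sin, math.cos, 3.0)
```

Starting from a reasonable guess, the iteration typically needs only a handful of steps, which is the property the paper relies on when it cites quadratic convergence.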

Abstract

Floating-point program errors can lead to severe consequences, particularly in critical domains such as military applications. Only a small subset of inputs may induce substantial floating-point errors, prompting researchers to develop methods for identifying these error-inducing inputs. Although existing approaches have achieved some success, they still suffer from two major limitations: (1) High computational cost: The evaluation of error magnitude for candidate inputs relies on high-precision programs, which are prohibitively time-consuming. (2) Limited long-range convergence capability: Current methods exhibit inefficiency in search, making the process akin to finding a needle in a haystack. To address these two limitations, we propose a novel method, named MGDE, to detect error-inducing inputs based on mathematical guidance. By employing the Newton-Raphson method, which exhibits quadratic convergence properties, we achieve highly effective and efficient results. Since the goal of identifying error-inducing inputs is to uncover the underlying bugs, we use the number of bugs detected in floating-point programs as the primary evaluation metric in our experiments. As FPCC represents the most effective state-of-the-art approach to date, we use it as the baseline for comparison. The dataset of FPCC consists of 88 single-input floating-point programs. FPCC is able to detect 48 bugs across 29 programs, whereas our method successfully identifies 89 bugs across 44 programs. Moreover, FPCC takes 6.4096 times as long as our proposed method. We also deploy our method to multi-input programs, identifying a total of nine bugs with an average detection time of 0.6443 seconds per program. In contrast, FPCC fails to detect any bugs while requiring an average computation time of 100 seconds per program.
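The first limitation named in the abstract, the cost of a high-precision reference, can be illustrated with a small sketch. The function `sqrt(x + 1) - sqrt(x)` is a hypothetical example chosen for its well-known cancellation at large `x`; it is not drawn from the FPCC dataset. The 60-digit `decimal` oracle stands in for whatever high-precision program a real detector would use.

```python
import math
from decimal import Decimal, localcontext


def f_double(x: float) -> float:
    """Hypothetical program under test, evaluated in double precision."""
    return math.sqrt(x + 1.0) - math.sqrt(x)


def f_reference(x: float) -> Decimal:
    """The same expression evaluated with 60 significant decimal digits."""
    with localcontext() as ctx:
        ctx.prec = 60
        d = Decimal(x)  # exact conversion of the double input
        return (d + 1).sqrt() - d.sqrt()


def relative_error(x: float) -> float:
    """Relative error of the double result against the high-precision oracle."""
    ref = f_reference(x)
    with localcontext() as ctx:
        ctx.prec = 60
        return float(abs((Decimal(f_double(x)) - ref) / ref))


benign = relative_error(1.0)    # well-conditioned input
harmful = relative_error(1e15)  # subtraction of two nearly equal square roots
```

Every candidate input checked this way pays for an arbitrary-precision evaluation, which is why a search that calls the oracle for each candidate becomes expensive.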

Paper Structure

This paper contains 20 sections, 1 equation, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Geometric illustration of Newton's method.
  • Figure 2: Detection process of FPCC on gsl_sf_airy_Ai_deriv_e. Black triangles denote the fitness values of the true error-inducing inputs missed by FPCC. Colored circles correspond to the fitness values of inputs evaluated by FPCC, with each color indicating a distinct region defined by FPCC’s partitioning scheme.
  • Figure 3: An overview of MGDE.
  • Figure 4:
  • Figure 5: Detection performance of MGDE on gsl_sf_airy_Ai_deriv_e. Red dots on the coordinate axis mark start points from which error-inducing inputs can be found; the red arrows indicate the locations of the identified inputs and their corresponding relative errors. Blue dots mark start points from which no such inputs could be detected.
  • ...and 1 more figure