
Evaluating Numerical Accuracy in Mixed-Precision Computing by Dual-Delta Testing

Peichen Xie

TL;DR

Dual-Delta Testing validates numerical accuracy in mixed-precision computing by replacing a single error delta with two error distributions, $\Delta_1$ and $\Delta_2$, each measured against a high-precision oracle $f_\Omega$. The paper formalizes the mathematical framework, presents an algorithm to compute and compare the two distributions, and offers statistical tools (descriptive statistics, visualizations, and hypothesis tests) to decide whether one implementation is equivalent or superior to the other. In matrix-multiplication case studies, the approach identifies both equivalent accuracy and latent numerical issues, and it validates a fix by confirming that the two error distributions return to parity. The methodology is a generalizable protocol for assessing numerical accuracy across mixed-precision implementations and hardware platforms.

Abstract

Mixed-precision computing has become increasingly important in modern high-performance computing and machine learning applications. When implementing custom mixed-precision functions (such as fused operators, optimized GPU kernels, or quantized inference paths), it is critical to verify their numerical accuracy. Traditional approaches typically compare the custom implementation against a reference using a single error metric. However, this single-delta approach gives limited insight into whether the observed errors are inherent to the precision level or specific to the implementation. This paper introduces Dual-Delta Testing, a systematic methodology that evaluates two error distributions against a high-precision oracle, enabling rigorous comparison between a custom implementation and a baseline reference. We present the mathematical framework, algorithmic formulation, statistical analysis techniques, and practical examples demonstrating the methodology's effectiveness in evaluating numerical accuracy.
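
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it swaps matrix multiplication for a dot product, emulates reduced precision by rounding through Python's `struct` module, and assumes absolute error as the error metric. All function names (`to_fp16`, `to_fp32`, `dot_oracle`, `dot_fp32_acc`, `dot_fp16_acc`, `dual_delta`) are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics
import struct


def to_fp16(x: float) -> float:
    """Round a binary64 value to the nearest IEEE 754 binary16 value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round a binary64 value to the nearest IEEE 754 binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_oracle(a, b):
    """Assumed oracle f_Omega: correctly rounded binary64 sum of the products."""
    return math.fsum(x * y for x, y in zip(a, b))


def dot_fp32_acc(a, b):
    """Assumed baseline f_1: fp16 inputs, fp32 accumulation, fp16 output."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_fp32(acc + to_fp32(x * y))
    return to_fp16(acc)


def dot_fp16_acc(a, b):
    """Assumed custom f_2: fp16 inputs, fp16 accumulation, fp16 output."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_fp16(acc + to_fp16(x * y))
    return acc


def dual_delta(f1, f2, oracle, inputs):
    """Return the two error samples (Delta_1, Delta_2) over shared inputs."""
    delta1, delta2 = [], []
    for a, b in inputs:
        ref = oracle(a, b)
        delta1.append(abs(f1(a, b) - ref))  # assumed metric: absolute error
        delta2.append(abs(f2(a, b) - ref))
    return delta1, delta2


rng = random.Random(0)
inputs = [
    ([to_fp16(rng.uniform(-1, 1)) for _ in range(256)],
     [to_fp16(rng.uniform(-1, 1)) for _ in range(256)])
    for _ in range(200)
]
delta1, delta2 = dual_delta(dot_fp32_acc, dot_fp16_acc, dot_oracle, inputs)
print("mean error, fp32 accumulation:", statistics.fmean(delta1))
print("mean error, fp16 accumulation:", statistics.fmean(delta2))
```

A single-delta test would compare `dot_fp16_acc` against `dot_fp32_acc` directly and report one number. Measuring both against the oracle instead shows how much of the error the baseline already incurs at this precision level and how much the custom path adds.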

Paper Structure (24 sections, 3 theorems, 4 equations, 3 figures, 3 tables, 1 algorithm)

Key Result

Proposition 1

If the distributions $\Delta_1$ and $\Delta_2$ are statistically indistinguishable, we conclude that $f_1$ and $f_2$ exhibit comparable numerical accuracy relative to the oracle.
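
One way to check whether two error samples are statistically distinguishable is a two-sample permutation test. The sketch below is an assumption for illustration, not the test the paper prescribes: it uses the difference of means as the test statistic, and the function name `permutation_test`, the resample count, and the synthetic half-normal error samples are all hypothetical.

```python
import random
import statistics


def permutation_test(delta1, delta2, n_resamples=2000, seed=0):
    """Two-sided permutation test on the difference of sample means.

    Returns an estimated p-value for the null hypothesis that both
    error samples come from the same distribution.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(delta1) - statistics.fmean(delta2))
    pooled = list(delta1) + list(delta2)
    n1 = len(delta1)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random relabeling of the pooled errors
        diff = abs(statistics.fmean(pooled[:n1]) - statistics.fmean(pooled[n1:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


# Synthetic error samples (hypothetical data, for illustration only).
rng = random.Random(1)
baseline = [abs(rng.gauss(0, 1e-3)) for _ in range(300)]
similar = [abs(rng.gauss(0, 1e-3)) for _ in range(300)]
degraded = [abs(rng.gauss(0, 5e-3)) for _ in range(300)]

p_similar = permutation_test(baseline, similar)
p_degraded = permutation_test(baseline, degraded)
print("p-value, similar implementation: ", p_similar)
print("p-value, degraded implementation:", p_degraded)
```

A small p-value indicates that the two error distributions differ. A large p-value only means the test found no evidence of a difference, so in practice it is best read alongside descriptive statistics and plots of the two distributions.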

Figures (3)

  • Figure 1: Error distribution comparison for $128 \times 128$ matrix multiplication. The GPU (blue) and CPU (orange) error distributions overlap almost entirely, indicating equivalent numerical accuracy.
  • Figure 2: Error distribution comparison for $128 \times 4096$ by $4096 \times 128$ matrix multiplication. The GPU (blue) and CPU (orange) error distributions are clearly separated, revealing a significant accuracy degradation in the default GPU implementation.
  • Figure 3: Error distribution comparison for $128 \times 4096$ by $4096 \times 128$ matrix multiplication after disabling reduced-precision reduction. The GPU (blue) and CPU (orange) error distributions now overlap almost entirely, confirming restored accuracy parity.

Theorems & Definitions (6)

  • Definition 1: Implementation
  • Definition 2: Oracle
  • Definition 3: Error Metric
  • Proposition 1: Equivalence
  • Proposition 2: Numerical Accuracy
  • Proposition 3: Numerical Stability