NOVA: Discovering Well-Conditioned Winograd Transforms through Numerical Optimization of Vandermonde Arithmetic
Jayant Lohia
TL;DR
NOVA addresses the numerical instability of Winograd convolution under low-precision arithmetic by reframing interpolation-point selection as a continuous optimization problem. An Evolution Strategy searches over the n-1 finite interpolation points; candidates are then snapped to simple rationals and checked by exact symbolic verification. The search discovers well-conditioned fractional points outside the traditional point vocabulary, improving Vandermonde conditioning. These gains propagate to the full Winograd transform stack, reducing 1D and 2D condition numbers (by up to 172,484× in 2D) and restoring high accuracy in FP16 ImageNet inference without retraining. The resulting transforms are drop-in replacements that add no latency, and they perform robustly across multiple architectures. The work also reports dtype-aware (dyadic) discovery for hardware-friendly deployment. Overall, it positions fractional point discovery as a practical, broadly applicable route to large-tile Winograd on modern hardware and to more efficient, energy-conscious CNN inference.
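The snap-to-rational step can be illustrated with a minimal sketch. It assumes that "simple rationals" means fractions with a small denominator; the helper name `snap_to_rational`, the denominator bound of 10, and the candidate values are all hypothetical and are not taken from the paper.

```python
from fractions import Fraction

def snap_to_rational(points, max_denominator=10):
    """Round each real-valued point to the nearest fraction whose
    denominator does not exceed max_denominator (an assumed bound)."""
    return [Fraction(p).limit_denominator(max_denominator) for p in points]

# Hypothetical continuous candidates, as an optimizer might return them.
candidates = [0.8331, -0.8331, 1.1672, -1.1672, 0.5994, -0.5994]
snapped = snap_to_rational(candidates)

# Interpolation points must be pairwise distinct, otherwise the
# Vandermonde matrix built from them is singular.
all_distinct = len(set(snapped)) == len(snapped)

print([str(q) for q in snapped], all_distinct)
```

Because `Fraction` arithmetic is exact, the snapped points can feed an exact correctness check without floating-point error.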
Abstract
Winograd convolution is a standard algorithm for efficient CNN inference, reducing arithmetic complexity by 2.25× for 3×3 kernels. However, it faces a critical barrier in the modern era of low-precision computing: numerical instability. As tiles scale to maximize efficiency (e.g., F(6,3), F(8,3)), the condition numbers of standard integer-based transforms explode, reaching κ ≈ 2 × 10^5 for F(8,3) and rendering them unusable in FP16 or INT8. We introduce NOVA (Numerical Optimization of Vandermonde Arithmetic), a discovery framework that breaks the decades-old convention of integer interpolation points. Treating Winograd point selection as a continuous optimization problem, NOVA searches R^(n-1) via an Evolution Strategy, snaps candidates to simple rationals, and guarantees correctness via symbolic verification. This process uncovers a hidden landscape of stable fractional configurations, such as {±5/6, ±7/6, ±3/5}, that lie outside the traditional point vocabulary. NOVA improves the conditioning of F(8,3) by about 415× in 1D, which squares to a 172,484× improvement for 2D convolution. In real-world FP16 ImageNet inference, where standard transforms collapse to near-chance accuracy (e.g., 4.7 percent on VGG16), NOVA's points restore full accuracy (75 to 78 percent), recovering over 70 percentage points without retraining, calibration, or learned parameters. The discovered transforms act as drop-in replacements, unlocking the efficiency of large-tile Winograd convolution for next-generation hardware.
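The 1D-to-2D squaring follows from a general property of the 2-norm condition number: for a separable 2D transform built as a Kronecker product, κ(V ⊗ V) = κ(V)². The sketch below checks this numerically with NumPy on plain Vandermonde matrices. Both point sets are illustrative assumptions (the fractional values are borrowed from the abstract), not the paper's F(8,3) transforms, and the sketch does not build the full Winograd transform stack, so its numbers should not be read as the paper's results.

```python
import numpy as np

def kappa_1d(points):
    """2-norm condition number of the square Vandermonde matrix of `points`."""
    v = np.vander(np.asarray(points, dtype=np.float64), increasing=True)
    return np.linalg.cond(v)

def kappa_2d(points):
    """Condition number of the separable 2D operator V (x) V."""
    v = np.vander(np.asarray(points, dtype=np.float64), increasing=True)
    return np.linalg.cond(np.kron(v, v))

# Illustrative 7-point sets (assumptions, not the paper's configurations).
integer_pts = [0, 1, -1, 2, -2, 3, -3]
fractional_pts = [0, 3/5, -3/5, 5/6, -5/6, 7/6, -7/6]

for name, pts in [("integer", integer_pts), ("fractional", fractional_pts)]:
    k1, k2 = kappa_1d(pts), kappa_2d(pts)
    print(f"{name:10s}  1D kappa = {k1:.3e}  2D kappa = {k2:.3e}  (1D)^2 = {k1**2:.3e}")

# Any 1D improvement ratio therefore squares in 2D.
ratio_1d = kappa_1d(integer_pts) / kappa_1d(fractional_pts)
ratio_2d = kappa_2d(integer_pts) / kappa_2d(fractional_pts)
```

This is also why the reported figures are consistent only up to rounding: 415² is 172,225, so the stated 172,484× corresponds to an unrounded 1D factor of roughly 415.3.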
