Table of Contents
Fetching ...

SPEC CPU2026: Characterization, Representativeness, and Cross-Suite Comparison

Ruihao Li, Andrew Jacob, Neeraja J. Yadwadkar, Lizy K. John

Abstract

Specialized accelerators dominate AI workloads, but CPUs remain critical for orchestrating these accelerators and running datacenter services. As a result, CPU performance increasingly shapes end-to-end system efficiency, making it necessary for benchmarks to reflect modern workloads and bottlenecks. However, it remains unclear how emerging CPU benchmark suites reflect these shifts. To address this, we present the first comprehensive characterization of SPEC CPU2026 across nine platforms spanning recent Intel, AMD, Ampere, and Nvidia processors. We find that, compared to SPEC CPU2017, SPEC CPU2026 increases instruction volume and memory footprint, and shifts pressure toward emerging bottlenecks, most notably higher instruction-cache stress. We next examine whether the full suite is necessary for architectural evaluation. Using clustering-based representativeness analysis, we identify that compact subsets of 4-5 workloads per group preserve 96.4-99.9% of full-suite behavior, substantially reducing evaluation costs without sacrificing fidelity. To better position SPEC CPU2026, we compare it against SPEC CPU2017, DCPerf, and MLPerf using cross-suite microarchitectural metrics. SPEC CPU2026 remains a general-purpose suite with complementary characteristics: it is less vector-intensive than MLPerf and has lower frontend pressure than DCPerf, yet moves closer to real-world CPU behavior than prior SPEC CPU generations. Finally, we show that SPEC CPU2026 supports practical architectural studies beyond aggregate scores through case studies on page sizes and allocators, prefetching, compiler optimizations, ISA sensitivity, and many-core scaling. The new round-robin stagger mode generates proxy workloads that approximate DCPerf, reducing the IPC gap to 13.7%. Overall, SPEC CPU2026 sets a new foundation for rigorous and cost-effective CPU evaluation.

SPEC CPU2026: Characterization, Representativeness, and Cross-Suite Comparison

Abstract

Specialized accelerators dominate AI workloads, but CPUs remain critical for orchestrating these accelerators and running datacenter services. As a result, CPU performance increasingly shapes end-to-end system efficiency, making it necessary for benchmarks to reflect modern workloads and bottlenecks. However, it remains unclear how emerging CPU benchmark suites reflect these shifts. To address this, we present the first comprehensive characterization of SPEC CPU2026 across nine platforms spanning recent Intel, AMD, Ampere, and Nvidia processors. We find that, compared to SPEC CPU2017, SPEC CPU2026 increases instruction volume and memory footprint, and shifts pressure toward emerging bottlenecks, most notably higher instruction-cache stress. We next examine whether the full suite is necessary for architectural evaluation. Using clustering-based representativeness analysis, we identify that compact subsets of 4-5 workloads per group preserve 96.4-99.9% of full-suite behavior, substantially reducing evaluation costs without sacrificing fidelity. To better position SPEC CPU2026, we compare it against SPEC CPU2017, DCPerf, and MLPerf using cross-suite microarchitectural metrics. SPEC CPU2026 remains a general-purpose suite with complementary characteristics: it is less vector-intensive than MLPerf and has lower frontend pressure than DCPerf, yet moves closer to real-world CPU behavior than prior SPEC CPU generations. Finally, we show that SPEC CPU2026 supports practical architectural studies beyond aggregate scores through case studies on page sizes and allocators, prefetching, compiler optimizations, ISA sensitivity, and many-core scaling. The new round-robin stagger mode generates proxy workloads that approximate DCPerf, reducing the IPC gap to 13.7%. Overall, SPEC CPU2026 sets a new foundation for rigorous and cost-effective CPU evaluation.

Paper Structure

This paper contains 21 sections, 1 equation, 36 figures, 8 tables.

Figures (36)

  • Figure 1: Key performance metric comparison between SPEC CPU26 and CPU17. SPEC CPU26 broadens L1I$ miss coverage.
  • Figure 2: SPEC CPU26 INT Rate.
  • Figure 3: SPEC CPU26 FP Rate.
  • Figure 4: SPEC CPU26 INT Speed.
  • Figure 5: SPEC CPU26 FP Speed.
  • ...and 31 more figures