General-Purpose Multicore Architectures
Saugata Ghose
TL;DR
This work analyzes the shift from single-core CPUs driven by instruction-level parallelism (ILP) to multicore architectures, a shift forced by power density and transistor scaling limits. It surveys multicore microarchitecture, memory hierarchies, coherence protocols, and OS-level optimizations, explaining how concurrency, cache sharing, and memory scheduling shape performance. It highlights key design trends such as DVFS, cache slicing, SoCs, heterogeneous cores, and chiplet-based designs, and discusses evaluation metrics that capture throughput, fairness, and energy efficiency. The work emphasizes that effective multicore CPUs rely on coordinated hardware and software strategies to manage memory interference, coherence, and heterogeneity. These strategies enable scalable performance in systems ranging from embedded devices to data centers, and designs continue to evolve toward modular, energy-aware architectures.
Abstract
The first years of the 2000s marked an inflection point in computer architecture. While the number of transistors available on a chip continued to grow, crucial transistor scaling properties started to break down, which increased power consumption. At the same time, aggressive single-core performance optimizations were yielding diminishing returns due to inherent limits in instruction-level parallelism. This led to the rise of multicore CPU architectures, which are now commonplace in modern computers at all scales. In this chapter, we discuss the evolution of multicore CPUs since their introduction. Starting with a historical overview of multiprocessing, we explore the basic microarchitecture of a multicore CPU, key challenges resulting from shared memory resources, operating system modifications that improve multicore CPU support, popular metrics for multicore evaluation, and recent trends in multicore CPU design.
