What Every Computer Scientist Needs To Know About Parallelization
Temitayo Adefemi
TL;DR
The paper surveys the theory and practice of parallelization, tracing the evolution from PRAM through BSP and LogP to multicore and heterogeneous systems, and contrasting these abstract models with the practical speedup limits described by Amdahl’s and Gustafson’s laws. It maps core paradigms (processes, threads, and the Actor model) and patterns (geometric decomposition, pipelines, recursive data), emphasizing how problem characteristics, problem size, and hardware shape performance. A central case study on a road traffic simulation demonstrates pattern choices, synchronization, memory considerations, and language-hardware trade-offs, highlighting MPI-based implementations and performance gaps between languages. The work also covers tooling, scalability evaluation, and deployment practices, offering concrete guidance for designing, optimizing, and validating parallel applications across diverse architectures. Overall, it links theory to practice to equip computer scientists with the skills to design robust, scalable, and efficient parallel software in an increasingly concurrent landscape.
Abstract
Parallelization has become a cornerstone of modern computing, influencing everything from high-performance supercomputers to everyday mobile devices. This paper presents a comprehensive guide to the fundamentals of parallelization that every computer scientist should know, beginning with a historical perspective that traces the evolution from early theoretical models such as PRAM and BSP to today's advanced multicore and heterogeneous architectures. We explore essential theoretical frameworks, practical paradigms, and synchronization mechanisms, and discuss implementation strategies using processes, threads, and modern approaches such as the Actor model. Additionally, we examine how hardware components, including CPUs, caches, memory, and accelerators, interact with software to affect performance, scalability, and load balancing. This work demystifies parallel programming by integrating historical context, theoretical underpinnings, and practical case studies. It equips readers with the tools to design, optimize, and troubleshoot parallel applications in an increasingly concurrent computing landscape.
