Impact of Data-Oriented and Object-Oriented Design on Performance and Cache Utilization with Artificial Intelligence Algorithms in Multi-Threaded CPUs

Gabriel M. Arantes, Richard F. Pinto, Bruno L. Dalmazo, Eduardo N. Borges, Giancarlo Lucca, Viviane L. D. de Mattos, Fabian C. Cardoso, Rafael A. Berri

TL;DR

The paper investigates how Data-Oriented Design (DOD) and Object-Oriented Design (OOD) affect performance and cache utilization on multi-core CPUs by comparing four implementations of the A* search algorithm. By combining cache-aware data layouts (array of structures, AoS, versus structure of arrays, SoA) with multi-threading in a single study, it quantifies execution time, memory usage, and cache misses for both paradigms. The results show that DOD generally reduces cache misses and speeds up multi-threaded runs, although for a fine-grained task like A* the single-threaded implementations outperform the multi-threaded ones because of threading overhead. Overall, the work argues that DOD is the more hardware-efficient approach for complex, data-intensive AI and parallel computing tasks.
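The AoS versus SoA distinction mentioned above can be illustrated with a minimal sketch. This is not the paper's code: the `Node` and `NodeTable` names, the field set (`x`, `y`, `g`, `h`), and the selection helpers are all assumptions chosen for illustration. The idea is that an OOD-style layout stores every field of a node together in one object, while a DOD-style layout keeps one contiguous array per field, so a scan that needs only `g` and `h` reads only those two arrays.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Node:
    """AoS / OOD style (hypothetical): each node is one object with all fields."""
    x: int
    y: int
    g: float  # cost from the start node
    h: float  # heuristic estimate to the goal


class NodeTable:
    """SoA / DOD style (hypothetical): one contiguous typed array per field."""

    def __init__(self) -> None:
        self.x = array("i")
        self.y = array("i")
        self.g = array("d")
        self.h = array("d")

    def add(self, x: int, y: int, g: float, h: float) -> int:
        """Append one node and return its index, which serves as its id."""
        self.x.append(x)
        self.y.append(y)
        self.g.append(g)
        self.h.append(h)
        return len(self.x) - 1


def best_aos(nodes: list) -> int:
    """Index of the node with the lowest f = g + h, visiting whole objects."""
    return min(range(len(nodes)), key=lambda i: nodes[i].g + nodes[i].h)


def best_soa(table: NodeTable) -> int:
    """Same selection, reading only the g and h arrays."""
    return min(range(len(table.g)), key=lambda i: table.g[i] + table.h[i])
```

Both helpers return the same index for the same data; only the memory layout differs. In CPython the interpreter overhead hides most of the cache effect, so this sketch shows the shape of the two layouts rather than their measured cost. In a compiled language the SoA scan would walk contiguous memory and skip the unused `x` and `y` fields.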

Abstract

The growing performance gap between multi-core CPUs and main memory necessitates hardware-aware software design paradigms. This study provides a comprehensive performance analysis of Data-Oriented Design (DOD) versus the traditional Object-Oriented Design (OOD), focusing on cache utilization and efficiency in multi-threaded environments. We developed and compared four distinct versions of the A* search algorithm: single-threaded OOD (ST-OOD), single-threaded DOD (ST-DOD), multi-threaded OOD (MT-OOD), and multi-threaded DOD (MT-DOD). The evaluation was based on metrics including execution time, memory usage, and CPU cache misses. In multi-threaded tests, the DOD implementation demonstrated considerable performance gains, with faster execution times and fewer raw system calls and cache misses. While OOD occasionally showed marginal advantages in memory usage or percentage-based cache miss rates, DOD's efficiency in data-intensive operations was more evident. Furthermore, our findings reveal that for a fine-grained task like the A* algorithm, the overhead associated with thread management led to single-threaded versions significantly outperforming their multi-threaded counterparts in both paradigms. We conclude that even when performance differences appear subtle in simple algorithms, the consistent advantages of DOD in critical metrics highlight its foundational architectural superiority, suggesting it is a more effective approach for maximizing hardware efficiency in complex, large-scale AI and parallel computing tasks.
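The observation about thread-management overhead on fine-grained tasks can be explored with a small sketch. It is an illustration under stated assumptions, not the authors' benchmark: the `manhattan` heuristic, the one-task-per-cell split, the worker count, and the `timed` helper are all hypothetical. It runs the same tiny unit of work once in a plain loop and once through a thread pool, and returns the elapsed time for each so the two can be compared on a given machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def manhattan(cell, goal):
    """A tiny unit of work, standing in for one heuristic evaluation."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])


def run_single(cells, goal):
    """Evaluate every cell in a plain loop on the calling thread."""
    return [manhattan(c, goal) for c in cells]


def run_threaded(cells, goal, workers=4):
    """Submit one task per cell; scheduling cost can exceed the work itself."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order.
        return list(pool.map(lambda c: manhattan(c, goal), cells))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

A possible use is `timed(run_single, cells, goal)` against `timed(run_threaded, cells, goal)` on the same list of cells. The two results should be identical, while the timings depend on the machine and are not predicted here. CPython's global interpreter lock also restricts CPU-bound threads, so this sketch isolates dispatch overhead only and says nothing about the speedups a compiled implementation might reach.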

Paper Structure

This paper contains 16 sections, 1 equation, 11 figures, and 1 table.

Figures (11)

  • Figure 1: Example of the A* algorithm in execution.
  • Figure 2: Execution time graph generated on the Ubuntu system.
  • Figure 3: Memory usage graph for the test conducted on the Ubuntu system.
  • Figure 5: Cache misses [%] graph generated on the Ubuntu system.
  • Figure 6: Graph of raw cache misses generated on the Ubuntu system.
  • ...and 6 more figures