Local Adjoints for Simultaneous Preaccumulations with Shared Inputs
Johannes Blühdorn, Nicolas R. Gauger
TL;DR
This work proposes different vector- and map-based approaches for storing local adjoint variables and analyzes them with respect to memory consumption, memory allocation, and adjoint variable access times in the context of simultaneous preaccumulations in multiple threads.
Abstract
In shared-memory parallel automatic differentiation, inputs that are shared among simultaneous thread-local preaccumulations lead to data races if Jacobians are accumulated with a single, shared vector of adjoint variables. In this work, we discuss the benefits and tradeoffs of re-enabling such preaccumulations by a transition to suitable local adjoints. We propose different vector- and map-based approaches for storing local adjoint variables and analyze them with respect to memory consumption, memory allocation, and adjoint variable access times in the context of simultaneous preaccumulations in multiple threads. We implement the approaches in CoDiPack and benchmark them in parallel discrete adjoint computations in the multiphysics simulation suite SU2.
