Publication details for Dr Tobias Weinzierl
Charrier, Dominic E., Hazelwood, Benjamin & Weinzierl, Tobias (2020). Enclave Tasking for Discontinuous Galerkin Methods on Dynamically Adaptive Meshes. SIAM Journal on Scientific Computing (SISC) 42(3): C69-C96.
- Publication type: Journal Article
- ISSN/ISBN: 1064-8275, 1095-7197
- DOI: 10.1137/19M1276194
Abstract
High-order discontinuous Galerkin (DG) methods promise to be an excellent discretization paradigm for hyperbolic differential equation solvers running on supercomputers, since they combine high arithmetic intensity with localized data access, since they straightforwardly translate into nonoverlapping domain decomposition, and since they facilitate dynamic adaptivity without the need for conformal meshes. An efficient parallel evaluation of DG weak formulation in an MPI+X setting, however, remains nontrivial as dependency graphs over dynamically adaptive meshes change with each mesh refinement or coarsening, as resolution transitions yield nontrivial data flow dependencies, and as data sent along domain boundaries through message passing (MPI) have to be triggered in the correct order. Domain decomposition with MPI alone starts to become insufficient if the mesh changes very frequently, if mesh changes cannot be predicted, and if limiters and nonlinear per-cell solves yield unpredictable costs per cell. We introduce enclave tasking as a task invocation technique for shared memory and MPI+X: It does not assemble any task graph; instead the mesh traversal spawns ready tasks directly. A marker-and-cell approach ensures that tasks feeding into MPI or triggering mesh modifications as well as latency-sensitive or bandwidth-demanding tasks are processed with high priority. The remaining cell tasks form enclaves, i.e., groups of tasks that can be processed in the background. Enclave tasking introduces high concurrency which is homogeneously distributed over the mesh traversal, it mixes memory-intensive volumetric DG calculations with compute-bound Riemann solves, and it helps to overlap communication with computations. Our work focuses on ADER-DG and patch-based finite volumes. Yet, we discuss how the paradigm can be generalized to the whole DG family and finite volume stand-alone solvers.
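The traversal pattern described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Cell` class, its two flags, `update_cell`, and `traverse` are hypothetical names chosen for illustration, and a plain thread pool stands in for a real tasking runtime. The sketch assumes that a cell counts as high priority if its result feeds into MPI or may trigger a mesh change; such cells are processed immediately by the traversal, while all other cells are spawned as ready background tasks without building a task graph.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Cell:
    index: int
    at_domain_boundary: bool = False   # hypothetical flag: result feeds into MPI
    triggers_refinement: bool = False  # hypothetical flag: may change the mesh
    value: float = 0.0

    @property
    def is_high_priority(self) -> bool:
        # Assumption: these two conditions mark cells that must not be deferred.
        return self.at_domain_boundary or self.triggers_refinement


def update_cell(cell: Cell) -> int:
    """Stand-in for a per-cell solve; returns the index of the updated cell."""
    cell.value += 1.0
    return cell.index


def traverse(mesh, workers=4):
    """One mesh sweep: urgent cells run inline, enclave cells in the background."""
    urgent_done = []
    enclave_futures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for cell in mesh:
            if cell.is_high_priority:
                # Processed immediately by the traversal itself.
                urgent_done.append(update_cell(cell))
            else:
                # Spawned as a ready task; no task graph is assembled.
                enclave_futures.append(pool.submit(update_cell, cell))
        # Boundary data could be exchanged here while enclave tasks still run.
        enclave_done = [future.result() for future in enclave_futures]
    return urgent_done, enclave_done


if __name__ == "__main__":
    mesh = [Cell(i, at_domain_boundary=(i in (0, 7))) for i in range(8)]
    print(traverse(mesh))
```

In this sketch the only synchronization point is the collection of enclave results at the end of the sweep, which is where a real solver would need the background work to be complete before the next traversal starts.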