Durham Research Online
Upscaling ExaHyPE – on each and every core

Li, Baojiu and Schulz, Holger and Tuft, Adam and Weinzierl, Tobias and Zhang, Han (2023) 'Upscaling ExaHyPE – on each and every core.', Technical Report. ARCHER2.


We study an MPI+multithreaded solver for hyperbolic partial differential equations. On each rank, every thread handles a subdomain of the computational domain, identified by a segment of a space-filling curve. The threads run in fork-join mode and spawn additional tasks, which are meant to compensate for load imbalance between the threads. Our studies show that this tasks-over-BSP paradigm is not properly supported by some OpenMP runtimes, leads to NUMA pollution, and is vulnerable to tiny tasks. It also suffers from many memory movements. Having replaced user data with smart pointers to avoid unnecessary copying, we propose to add a NUMA-aware queuing system on top of OpenMP and to batch multiple tasks into meta tasks that can spread out over idle cores. Many of these techniques are workarounds for current OpenMP runtime implementations, and we expect them to become unnecessary as the OpenMP runtimes evolve. The insights thus have a pathfinding character.

Item Type: Monograph (Technical Report)
Full text: (VoR) Version of Record
Available under License - Creative Commons Attribution Non-commercial No Derivatives 4.0.
Date accepted: No date available
Date deposited: 05 May 2023
Date of first online publication: 02 May 2023
Date first made open access: 05 May 2023