Durham Research Online
Hardware-aware block size tailoring on adaptive spacetree grids for shallow water waves.

Weinzierl, Tobias and Wittmann, Roland and Unterweger, Kristof and Bader, Michael and Breuer, Alexander and Rettenberger, Sebastian (2014) 'Hardware-aware block size tailoring on adaptive spacetree grids for shallow water waves.', in HiStencils 2014 - Proceedings of the 1st international workshop on high-performance stencil computations, pp. 57-64. HiPEAC.

Abstract

Spacetrees are a popular formalism to describe dynamically adaptive Cartesian grids. Though they directly yield an adaptive spatial discretisation, i.e. a mesh, it is often more efficient to augment them by regular Cartesian blocks embedded into the spacetree leaves. This facilitates stencil kernels working efficiently on homogeneous data chunks. The choice of a proper block size, however, is delicate. While large block sizes foster simple loop parallelism, vectorisation, and lead to branch-free compute kernels, they bring along disadvantages. Large blocks restrict the granularity of adaptivity and hence increase the memory footprint and lower the numerical-accuracy-per-byte efficiency. Large block sizes also reduce the block-level concurrency that can be used for dynamic load balancing. In the present paper, we therefore propose a spacetree-block coupling that can dynamically tailor the block size to the compute characteristics. For that purpose, we allow different block sizes per spacetree node. Groups of blocks of the same size are identified automatically throughout the simulation iterations, and a predictor function triggers the replacement of these blocks by one huge, regularly refined block. This predictor can pick up hardware characteristics while the dynamic adaptivity of the fine grid mesh is not constrained. We study such characteristics with a state-of-the-art shallow water solver and examine proper block size choices on AMD Bulldozer and Intel Sandy Bridge processors.
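
The block replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Block`, `should_merge`, and `tailor` are hypothetical, the grid is assumed to be two-dimensional, the refinement factor of 3 per axis is an assumption, and the single threshold `max_cells_per_axis` merely stands in for whatever hardware characteristics the paper's predictor actually evaluates.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    """A regular Cartesian block embedded into one spacetree leaf (2D assumed)."""
    cells_per_axis: int


def should_merge(children: List[Block], max_cells_per_axis: int,
                 refinement_factor: int) -> bool:
    """Hypothetical predictor: merge only a complete group of equally sized
    sibling blocks, and only if the merged block stays within a size limit
    that stands in for a hardware characteristic (for example cache capacity)."""
    if len(children) != refinement_factor ** 2:
        return False  # not a complete group of siblings
    sizes = {block.cells_per_axis for block in children}
    if len(sizes) != 1:
        return False  # block sizes differ, so the group is not homogeneous
    merged_size = sizes.pop() * refinement_factor
    return merged_size <= max_cells_per_axis


def tailor(children: List[Block], max_cells_per_axis: int,
           refinement_factor: int = 3) -> List[Block]:
    """Replace the sibling blocks by one larger regular block when the
    predictor agrees; otherwise return the blocks unchanged."""
    if should_merge(children, max_cells_per_axis, refinement_factor):
        merged_size = children[0].cells_per_axis * refinement_factor
        return [Block(merged_size)]
    return list(children)
```

Under these assumptions, nine sibling blocks of 8 cells per axis collapse into one block of 24 cells per axis when the limit is 32, and stay separate when the limit is 16 or when their sizes differ.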

Item Type: Book chapter
Full text: (VoR) Version of Record (PDF, 1352Kb)
Status: Peer-reviewed
Publisher Web site: http://www.exastencils.org/histencils/
Date accepted: No date available
Date deposited: 27 February 2014
Date of first online publication: January 2014
Date first made open access: No date available
