Charrier, Dominic E. and Weinzierl, Tobias (2018) 'An experience report on (auto-)tuning of mesh-based PDE solvers on shared memory systems.', in Parallel processing and applied mathematics : 12th International Conference, PPAM 2017, Lublin, Poland, September 10-13, 2017 ; revised selected papers. Part I. Cham: Springer, pp. 3-13. Lecture notes in computer science. (10777).
With the advent of manycore systems, shared memory parallelisation has gained importance in high performance computing. Once a code is decomposed into tasks or parallel regions, it becomes crucial to identify reasonable grain sizes, i.e. minimum problem sizes per task that make the algorithm expose a high concurrency at low overhead. Many papers do not detail what reasonable task sizes are, and consider their findings craftsmanship not worth discussion. We have implemented an autotuning algorithm, a machine learning approach, for a project developing a hyperbolic equation system solver. Autotuning here is important as the grid and task workload are multifaceted and change frequently during runtime. In this paper, we summarise our lessons learned. We infer tweaks and idioms for general autotuning algorithms and we clarify that such an approach does not free users completely from grain size awareness.
|Item Type:||Book chapter|
|Keywords:||Autotuning, Shared memory, Grain size, Machine learning.|
|Full text:||(AM) Accepted Manuscript|
First Live Deposit - 22 June 2017
|Publisher Web site:||https://doi.org/10.1007/978-3-319-78054-2_1|
|Publisher statement:||The final publication is available at Springer via https://doi.org/10.1007/978-3-319-78054-2_1|
|Record Created:||22 Jun 2017 09:43|
|Last Modified:||24 Mar 2019 00:56|