
Durham Research Online

GPU fast convolution via the overlap-and-save method in shared memory.

Adámek, Karel and Dimoudi, Sofia and Giles, Mike and Armour, Wesley (2020) 'GPU fast convolution via the overlap-and-save method in shared memory.', ACM Transactions on Architecture and Code Optimization, 17 (3). p. 18.


We present a GPU-tailored implementation of the overlap-and-save method, which convolves very long signals with short response functions. We have implemented several FFT algorithms in the CUDA programming language that exploit GPU shared memory, allowing for GPU-accelerated convolution. We compare our implementation with an implementation of the overlap-and-save algorithm that uses the NVIDIA FFT library (cuFFT). We demonstrate that by using a shared-memory-based FFT, we can achieve significant speed-ups for certain problem sizes and lower the memory requirements of the overlap-and-save method on GPUs.
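
For readers unfamiliar with the method named in the abstract, the following is a minimal CPU sketch of overlap-and-save in plain Python. It is not the authors' shared-memory CUDA implementation and says nothing about its performance. The function names (`fft`, `overlap_save`, `direct_convolve`) and the default segment length are assumptions chosen for illustration. The signal is cut into overlapping segments, each segment is circularly convolved with the filter via an FFT, and the first `len(h) - 1` samples of every segment result, which are contaminated by wrap-around, are discarded.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def overlap_save(signal, h, seg_len=16):
    """Linear convolution of `signal` with a short filter `h` (overlap-and-save)."""
    m = len(h)
    # seg_len must be a power of two (for this FFT) and at least the filter length.
    assert seg_len >= m and seg_len & (seg_len - 1) == 0
    step = seg_len - (m - 1)              # valid output samples per segment
    out_len = len(signal) + m - 1         # length of the full linear convolution
    n_seg = -(-out_len // step)           # ceil(out_len / step)

    # Filter spectrum is computed once and reused for every segment.
    H = fft(list(h) + [0.0] * (seg_len - m))

    # Prepend m-1 zeros so the first segment has "history"; pad the tail so
    # the last segment is full length.
    padded = [0.0] * (m - 1) + list(signal) + [0.0] * (n_seg * step - len(signal))

    out = []
    for s in range(n_seg):
        seg = padded[s * step : s * step + seg_len]
        X = fft(seg)
        y = fft([a * b for a, b in zip(X, H)], inverse=True)
        # Drop the first m-1 samples (circular wrap-around), keep the rest.
        out.extend(v.real / seg_len for v in y[m - 1:])
    return out[:out_len]


def direct_convolve(signal, h):
    """Reference O(N*M) linear convolution, used only to check the result."""
    out = [0.0] * (len(signal) + len(h) - 1)
    for i, s in enumerate(signal):
        for k, c in enumerate(h):
            out[i + k] += s * c
    return out
```

Each segment reuses the last `len(h) - 1` input samples of the previous one, so a larger `seg_len` relative to the filter length means less redundant work per output sample. The sketch leaves out everything GPU-specific.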

Item Type: Article
Additional Information: PubHub#
Full text: (VoR) Version of Record
Available under License - Creative Commons Attribution.
Publisher statement: This work is licensed under a Creative Commons Attribution International 4.0 License.
Date accepted: 30 April 2020
Date deposited: 18 November 2020
Date of first online publication: 31 August 2020
Date first made open access: 18 November 2020
