Karel Adámek
GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory
Adámek, Karel; Dimoudi, Sofia; Giles, Mike; Armour, Wesley
Abstract
We present a GPU-tailored implementation of the overlap-and-save method, a method for the convolution of very long signals with short response functions. We have implemented several FFT algorithms in the CUDA programming language that exploit GPU shared memory, allowing for GPU-accelerated convolution. We compare our implementation with an implementation of the overlap-and-save algorithm that uses the NVIDIA FFT library (cuFFT). We demonstrate that by using a shared-memory-based FFT, we can achieve significant speed-ups for certain problem sizes and lower the memory requirements of the overlap-and-save method on GPUs.
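For readers unfamiliar with the overlap-and-save method named in the abstract, the following is a minimal CPU sketch in pure Python. It is not the authors' CUDA shared-memory implementation and makes no claim about their design. The function names (`fft`, `overlap_save`, `direct_convolve`), the `fft_size` parameter, and the simple recursive radix-2 FFT are all assumptions chosen for illustration. It only shows the general idea: the long signal is cut into overlapping segments, each segment is convolved with the short response function via FFT, and the aliased leading samples of every segment are discarded.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def overlap_save(signal, h, fft_size=16):
    """Full linear convolution of `signal` with short filter `h` (illustrative)."""
    m = len(h)
    step = fft_size - (m - 1)  # number of valid outputs per segment
    assert fft_size & (fft_size - 1) == 0, "fft_size must be a power of two"
    assert step > 0, "fft_size must exceed len(h) - 1"

    # The filter spectrum is computed once and reused for every segment.
    H = fft(list(h) + [0.0] * (fft_size - m))

    out_len = len(signal) + m - 1
    # m - 1 leading zeros act as history for the first segment;
    # trailing zeros guarantee that the last segment is full length.
    padded = [0.0] * (m - 1) + list(signal) + [0.0] * (fft_size + m)

    out = []
    pos = 0
    while len(out) < out_len:
        seg = padded[pos:pos + fft_size]
        S = fft(seg)
        y = fft([a * b for a, b in zip(S, H)], inverse=True)
        # The first m - 1 samples are corrupted by circular wrap-around: drop them.
        out.extend(v.real / fft_size for v in y[m - 1:])
        pos += step  # consecutive segments overlap by m - 1 samples
    return out[:out_len]


def direct_convolve(signal, h):
    """Reference O(N*M) time-domain convolution used to check the sketch."""
    out = [0.0] * (len(signal) + len(h) - 1)
    for i, s in enumerate(signal):
        for k, c in enumerate(h):
            out[i + k] += s * c
    return out
```

Calling `overlap_save(signal, h)` should agree with `direct_convolve(signal, h)` up to floating-point rounding. In this sketch every segment is independent once the filter spectrum is known, which is the property that makes the method a natural fit for parallel hardware.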
Citation
Adámek, K., Dimoudi, S., Giles, M., & Armour, W. (2020). GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory. ACM Transactions on Architecture and Code Optimization, 17(3), Article 18. https://doi.org/10.1145/3394116
Journal Article Type | Article |
---|---|
Acceptance Date | Apr 30, 2020 |
Online Publication Date | Aug 31, 2020 |
Publication Date | 2020-08 |
Deposit Date | Nov 18, 2020 |
Publicly Available Date | Mar 28, 2024 |
Journal | ACM Transactions on Architecture and Code Optimization |
Print ISSN | 1544-3566 |
Electronic ISSN | 1544-3973 |
Publisher | Association for Computing Machinery (ACM) |
Peer Reviewed | Peer Reviewed |
Volume | 17 |
Issue | 3 |
Article Number | 18 |
DOI | https://doi.org/10.1145/3394116 |
Files
Published Journal Article (PDF, 2.4 Mb)
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/
Copyright Statement
This work is licensed under a Creative Commons Attribution International 4.0 License.