Design and evaluation of parallel hashing over large-scale data
Cheng, L.; Kotoulas, S.; Ward, T.; Theodoropoulos, G.
Authors
L. Cheng
S. Kotoulas
T. Ward
G. Theodoropoulos
Abstract
High-performance analytical data processing systems often run on servers with large amounts of memory. A common data structure used in such environments is the hash table. This paper investigates efficient parallel hash algorithms for processing large-scale data. Currently, hash tables on distributed architectures are accessed one key at a time by local or remote threads, while shared-memory approaches focus on accessing a single table with multiple threads. A relatively straightforward “bulk-operation” approach seems to have been neglected by researchers. In this work, using such a method, we propose a high-level parallel hashing framework, Structured Parallel Hashing, which targets the efficient processing of massive data on distributed memory. We present a theoretical analysis of the proposed method and describe the design of our hashing implementations. The evaluation reveals a very interesting result: the proposed straightforward method can vastly outperform distributed hashing methods and can even offer performance comparable with approaches based on shared-memory supercomputers which use specialized hardware predicates. Moreover, we characterize the performance of our hash implementations through extensive experiments, thereby allowing system developers to make a more informed choice for their high-performance applications.
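To make the contrast between per-key access and a "bulk-operation" approach concrete, the following is a minimal single-process sketch. It is not the authors' Structured Parallel Hashing implementation. It assumes that a bulk operation means grouping all keys by destination partition first and then inserting each group into its own local table in one batch. The function names (`partition_keys`, `bulk_insert`, `build_tables`, `lookup`) and the use of plain Python dictionaries to stand in for per-node tables are hypothetical choices made for illustration only.

```python
def partition_keys(pairs, num_partitions):
    """Group (key, value) pairs by destination partition in a single pass."""
    buckets = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        # Assumption: the partition is chosen by hashing the key.
        buckets[hash(key) % num_partitions].append((key, value))
    return buckets


def bulk_insert(tables, buckets):
    """Insert each bucket into its own local table.

    In a distributed setting each bucket would be shipped to its node as
    one batch; here the "nodes" are simply separate dictionaries.
    """
    for table, bucket in zip(tables, buckets):
        for key, value in bucket:
            table.setdefault(key, []).append(value)


def build_tables(pairs, num_partitions):
    """Build one local hash table per partition from a batch of pairs."""
    tables = [dict() for _ in range(num_partitions)]
    bulk_insert(tables, partition_keys(pairs, num_partitions))
    return tables


def lookup(tables, key):
    """Find the values for a key by visiting only its owning partition."""
    return tables[hash(key) % len(tables)].get(key, [])
```

In this sketch, each local table is written only by the loop iteration that owns it, so a parallel version would need no locking on individual keys. Whether this matches the design evaluated in the paper is an assumption.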
Citation
Cheng, L., Kotoulas, S., Ward, T., & Theodoropoulos, G. (2014). Design and evaluation of parallel hashing over large-scale data. In 2014 21st International Conference on High Performance Computing (HiPC 2014) : Velha Goa, India, 17 - 20 December 2014 (1-10). https://doi.org/10.1109/hipc.2014.7116909
Conference Name | 2014 21st International Conference on High Performance Computing (HiPC) |
---|---|
Conference Location | Velha Goa, India |
Start Date | Dec 17, 2014 |
End Date | Dec 20, 2014 |
Publication Date | Dec 20, 2014 |
Deposit Date | Apr 21, 2016 |
Publicly Available Date | Apr 28, 2016 |
Pages | 1-10 |
Series ISSN | 1094-7256 |
Book Title | 2014 21st International Conference on High Performance Computing (HiPC 2014) : Velha Goa, India, 17 - 20 December 2014. |
DOI | https://doi.org/10.1109/hipc.2014.7116909 |
Additional Information | Date of Conference: 17-20 December 2014 |
Files
Accepted Conference Proceeding (PDF, 166 Kb)
Copyright Statement
© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.