Robust and Skew-resistant Parallel Joins in Shared-Nothing Systems

Cheng, L.; Kotoulas, S.; Ward, T.; Theodoropoulos, G.


Authors

L. Cheng

S. Kotoulas

T. Ward

G. Theodoropoulos



Abstract

The performance of joins in parallel database management systems is critical for data-intensive operations such as querying. Since data skew is common in many applications, poorly engineered join operations result in load imbalance and performance bottlenecks. State-of-the-art methods designed to handle this problem offer significant improvements over naive implementations. However, performance could be further improved by removing the dependency on global skew knowledge and broadcasting. In this paper, we propose PRPQ (partial redistribution & partial query), an efficient and robust join algorithm for processing large-scale joins over distributed systems. We present the detailed implementation and a quantitative evaluation of our method. The experimental results demonstrate that the proposed PRPQ algorithm is robust and scalable under a wide range of skew conditions. Specifically, compared to the state-of-the-art PRPD method, we achieve a 16% to 167% performance improvement and 24% to 54% less network communication under different join workloads.

Citation

Cheng, L., Kotoulas, S., Ward, T., & Theodoropoulos, G. (2014). Robust and Skew-resistant Parallel Joins in Shared-Nothing Systems. In CIKM'14 : proceedings of the 23rd ACM International Conference on Information and Knowledge Management : November 3-7, 2014, Shanghai, China (1399-1408). https://doi.org/10.1145/2661829.2661888

Conference Name: 23rd ACM International Conference on Information and Knowledge Management - CIKM '14
Conference Location: Shanghai, China
Start Date: Nov 3, 2014
End Date: Nov 7, 2014
Publication Date: Nov 3, 2014
Deposit Date: Apr 21, 2016
Publicly Available Date: Mar 28, 2024
Pages: 1399-1408
Book Title: CIKM'14 : proceedings of the 23rd ACM International Conference on Information and Knowledge Management : November 3-7, 2014, Shanghai, China.
DOI: https://doi.org/10.1145/2661829.2661888

Files

Accepted Conference Proceeding (409 KB)
PDF

Copyright Statement
© 2014 ACM. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Long Cheng, Spyros Kotoulas, Tomas E. Ward, and Georgios Theodoropoulos. 2014. Robust and Skew-resistant Parallel Joins in Shared-Nothing Systems. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management (CIKM '14). ACM, New York, NY, USA, 1399-1408. DOI=http://dx.doi.org/10.1145/2661829.2661888



