Fast Compression of Large Semantic Web Data using X10
Cheng, L.; Avinash, M.; Kotoulas, S.; Ward, T.; Theodoropoulos, G.
Authors
L. Cheng
M. Avinash
S. Kotoulas
T. Ward
G. Theodoropoulos
Abstract
The Semantic Web comprises enormous volumes of semi-structured data elements. For interoperability, these elements are represented by long strings. Such representations are inefficient for applications that perform computations over large volumes of such information. A common approach to alleviating this problem is the use of compression methods that produce more compact representations of the data. Dictionary encoding is particularly prevalent in Semantic Web database systems for this purpose. However, centralized implementations present performance bottlenecks, giving rise to the need for scalable, efficient distributed encoding schemes. In this paper, we propose an efficient algorithm for fast encoding of large Semantic Web data. Specifically, we present the detailed implementation of our approach based on the state-of-the-art asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate performance on a cluster of up to 384 cores and datasets of up to 11 billion triples (1.9 TB). Compared to the state-of-the-art approach, we demonstrate a speed-up of 2.6-7.4× and excellent scalability. These results also illustrate the significant potential of the APGAS model for efficient implementation of dictionary encoding and contribute to the engineering of more efficient, larger-scale Semantic Web applications.
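As an illustration of the dictionary encoding idea the abstract refers to, the following is a minimal, single-process sketch in Python. It is not the distributed X10/APGAS algorithm proposed in the paper; the function names `encode_triples` and `decode_triples`, the integer ID assignment scheme, and the `example.org` terms are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]


def encode_triples(triples: List[Triple]) -> Tuple[List[EncodedTriple], Dict[str, int]]:
    """Replace each long term string with a compact integer ID.

    Hypothetical helper: IDs are assigned in order of first appearance.
    """
    term_to_id: Dict[str, int] = {}
    encoded: List[EncodedTriple] = []
    for s, p, o in triples:
        ids = []
        for term in (s, p, o):
            # Assign the next free ID the first time a term is seen.
            if term not in term_to_id:
                term_to_id[term] = len(term_to_id)
            ids.append(term_to_id[term])
        encoded.append((ids[0], ids[1], ids[2]))
    return encoded, term_to_id


def decode_triples(encoded: List[EncodedTriple], term_to_id: Dict[str, int]) -> List[Triple]:
    """Invert the dictionary to recover the original term strings."""
    id_to_term = {i: t for t, i in term_to_id.items()}
    return [(id_to_term[s], id_to_term[p], id_to_term[o]) for s, p, o in encoded]


# Illustrative input only; these terms are not taken from the paper's datasets.
triples = [
    ("<http://example.org/alice>", "<http://example.org/knows>", "<http://example.org/bob>"),
    ("<http://example.org/bob>", "<http://example.org/knows>", "<http://example.org/alice>"),
]
encoded, dictionary = encode_triples(triples)
```

In this centralized form, every term lookup goes through a single dictionary, which is the kind of bottleneck the abstract says motivates a distributed encoding scheme.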
Citation
Cheng, L., Avinash, M., Kotoulas, S., Ward, T., & Theodoropoulos, G. (2016). Fast Compression of Large Semantic Web Data using X10. IEEE Transactions on Parallel and Distributed Systems, 27(9), 2603-2617. https://doi.org/10.1109/tpds.2015.2496579
Journal Article Type | Article |
---|---|
Acceptance Date | Oct 25, 2015 |
Online Publication Date | Oct 30, 2015 |
Publication Date | Sep 1, 2016 |
Deposit Date | Apr 21, 2016 |
Publicly Available Date | Apr 21, 2016 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Print ISSN | 1045-9219 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 27 |
Issue | 9 |
Pages | 2603-2617 |
DOI | https://doi.org/10.1109/tpds.2015.2496579 |
Files
Accepted Journal Article
(1.4 Mb)
PDF
Copyright Statement
© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.