Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/23351
Title: Reasoning on data partitioning for single-round multi-join evaluation in massively parallel systems
Authors: AMELOOT, Tom 
GECK, Gaetano 
KETSMAN, Bas 
NEVEN, Frank 
Schwentick, Thomas
Issue Date: 2017
Source: Communications of the ACM, 60(3), p. 93-100
Abstract: Evaluating queries over massive amounts of data is a major challenge in the big data era. Modern massively parallel systems, like e.g. Spark, organize query answering as a sequence of rounds each consisting of a distinct communication phase followed by a computation phase. The communication phase redistributes data over the available servers, while in the subsequent computation phase each server performs the actual computation on its local data. There is a growing interest in single-round algorithms for evaluating multiway joins where data is first reshuffled over the servers and then evaluated in a parallel but communication-free way. As the amount of communication induced by a reshuffling of the data is a dominating cost in such systems, we introduce a framework for reasoning about data partitioning to detect when we can avoid the data reshuffling step. Specifically, we formalize the decision problems parallel-correctness and transfer of parallel-correctness, provide semantical characterizations, and obtain tight complexity bounds.
Notes: Ameloot, TJ (reprint author), Hasselt Univ, Hasselt, Belgium. tom.ameloot@uhasselt.be; gaetano.geck@udo.edu; bas.ketsman@uhasselt.be; frank.neven@uhasselt.be; thomas.schwentick@udo.edu
Document URI: http://hdl.handle.net/1942/23351
ISSN: 0001-0782
e-ISSN: 1557-7317
DOI: 10.1145/3041063
ISI #: 000396058600024
Category: A1
Type: Journal Contribution
Appears in Collections:Research publications

Files in This Item:
File Description SizeFormat 
cacm.pdfPeer-reviewed author version333.7 kBAdobe PDFView/Open
p93-ameloot.pdf
  Restricted Access
Published version1.19 MBAdobe PDFView/Open    Request a copy
Show full item record

SCOPUSTM   
Citations

3
checked on Sep 2, 2020

WEB OF SCIENCETM
Citations

3
checked on Dec 2, 2022

Page view(s)

82
checked on Sep 5, 2022

Download(s)

204
checked on Sep 5, 2022

Google ScholarTM

Check

Altmetric


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.