Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/23351
Full metadata record
DC Field | Value | Language
dc.contributor.author | AMELOOT, Tom | -
dc.contributor.author | GECK, Gaetano | -
dc.contributor.author | KETSMAN, Bas | -
dc.contributor.author | NEVEN, Frank | -
dc.contributor.author | Schwentick, Thomas | -
dc.date.accessioned | 2017-03-15T08:15:44Z | -
dc.date.available | 2017-03-15T08:15:44Z | -
dc.date.issued | 2017 | -
dc.identifier.citation | Communications of the ACM, 60(3), p. 93-100 | -
dc.identifier.issn | 0001-0782 | -
dc.identifier.uri | http://hdl.handle.net/1942/23351 | -
dc.description.abstract | Evaluating queries over massive amounts of data is a major challenge in the big data era. Modern massively parallel systems, such as Spark, organize query answering as a sequence of rounds, each consisting of a distinct communication phase followed by a computation phase. The communication phase redistributes data over the available servers, while in the subsequent computation phase each server performs the actual computation on its local data. There is a growing interest in single-round algorithms for evaluating multiway joins where data is first reshuffled over the servers and then evaluated in a parallel but communication-free way. As the amount of communication induced by a reshuffling of the data is a dominating cost in such systems, we introduce a framework for reasoning about data partitioning to detect when we can avoid the data reshuffling step. Specifically, we formalize the decision problems parallel-correctness and transfer of parallel-correctness, provide semantical characterizations, and obtain tight complexity bounds. | -
dc.language.iso | en | -
dc.title | Reasoning on data partitioning for single-round multi-join evaluation in massively parallel systems | -
dc.type | Journal Contribution | -
dc.identifier.epage | 100 | -
dc.identifier.issue | 3 | -
dc.identifier.spage | 93 | -
dc.identifier.volume | 60 | -
local.bibliographicCitation.jcat | A1 | -
dc.description.notes | Ameloot, TJ (reprint author), Hasselt Univ, Hasselt, Belgium. tom.ameloot@uhasselt.be; gaetano.geck@udo.edu; bas.ketsman@uhasselt.be; frank.neven@uhasselt.be; thomas.schwentick@udo.edu | -
local.type.refereed | Refereed | -
local.type.specified | Article | -
dc.identifier.doi | 10.1145/3041063 | -
dc.identifier.isi | 000396058600024 | -
item.fulltext | With Fulltext | -
item.contributor | AMELOOT, Tom | -
item.contributor | GECK, Gaetano | -
item.contributor | KETSMAN, Bas | -
item.contributor | NEVEN, Frank | -
item.contributor | Schwentick, Thomas | -
item.fullcitation | AMELOOT, Tom; GECK, Gaetano; KETSMAN, Bas; NEVEN, Frank & Schwentick, Thomas (2017) Reasoning on data partitioning for single-round multi-join evaluation in massively parallel systems. In: Communications of the ACM, 60(3), p. 93-100. | -
item.accessRights | Open Access | -
item.validation | ecoom 2018 | -
crisitem.journal.issn | 0001-0782 | -
crisitem.journal.eissn | 1557-7317 | -
Appears in Collections: Research publications
Files in This Item:
File | Description | Size | Format
cacm.pdf | Peer-reviewed author version | 333.7 kB | Adobe PDF
p93-ameloot.pdf (Restricted Access) | Published version | 1.19 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.