Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/23344
Full metadata record
DC Field | Value | Language
dc.contributor.author | DAENEN, Jonny | -
dc.contributor.author | NEVEN, Frank | -
dc.contributor.author | TAN, Tony | -
dc.contributor.author | VANSUMMEREN, Stijn | -
dc.date.accessioned | 2017-03-14T13:00:01Z | -
dc.date.available | 2017-03-14T13:00:01Z | -
dc.date.issued | 2016 | -
dc.identifier.citation | Proceedings of the VLDB Endowment, 9(10), p. 732-743 | -
dc.identifier.issn | 2150-8097 | -
dc.identifier.uri | http://hdl.handle.net/1942/23344 | -
dc.description.abstract | While services such as Amazon AWS make computing power abundantly available, adding more computing nodes can incur high costs in, for instance, pay-as-you-go plans while not always significantly improving the net running time (aka wall-clock time) of queries. In this work, we provide algorithms for parallel evaluation of SGF queries in MapReduce that optimize total time, while retaining low net time. Not only can SGF queries specify all semi-join reducers, but also more expressive queries involving disjunction and negation. Since SGF queries can be seen as Boolean combinations of (potentially nested) semi-joins, we introduce a novel multi-semi-join (MSJ) MapReduce operator that enables the evaluation of a set of semi-joins in one job. We use this operator to obtain parallel query plans for SGF queries that outvalue sequential plans w.r.t. net time and provide additional optimizations aimed at minimizing total time without severely affecting net time. Even though the latter optimizations are NP-hard, we present effective greedy algorithms. Our experiments, conducted using our own implementation Gumbo on top of Hadoop, confirm the usefulness of parallel query plans, and the effectiveness and scalability of our optimizations, all with a significant improvement over Pig and Hive. | -
dc.description.sponsorship | The third author was supported in part by grant no. NTU-ERP-105R89082D and the Ministry of Science and Technology Taiwan under grant no. 104-2218-E-002-038. The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government – department EWI. We thank Jan Van den Bussche and Jelle Hellings for inspiring discussions and Geert Jan Bex for assistance with cluster setup. | -
dc.language.iso | en | -
dc.rights | Copyright 2016 VLDB Endowment 2150-8097/16/06. | -
dc.title | Parallel Evaluation of Multi-Semi-Joins | -
dc.type | Journal Contribution | -
dc.identifier.epage | 743 | -
dc.identifier.issue | 10 | -
dc.identifier.spage | 732 | -
dc.identifier.volume | 9 | -
local.bibliographicCitation.jcat | A1 | -
local.type.refereed | Refereed | -
local.type.specified | Article | -
local.class | dsPublValOverrule/author_version_not_expected | -
local.type.programme | VSC | -
dc.identifier.doi | 10.14778/2977797.2977800 | -
dc.identifier.url | http://www.vldb.org/pvldb/vol9/p732-daenen.pdf | -
item.fulltext | With Fulltext | -
item.contributor | DAENEN, Jonny | -
item.contributor | NEVEN, Frank | -
item.contributor | TAN, Tony | -
item.contributor | VANSUMMEREN, Stijn | -
item.accessRights | Open Access | -
item.fullcitation | DAENEN, Jonny; NEVEN, Frank; TAN, Tony & VANSUMMEREN, Stijn (2016) Parallel Evaluation of Multi-Semi-Joins. In: Proceedings of the VLDB Endowment, 9(10), p. 732-743. | -
crisitem.journal.issn | 2150-8097 | -
crisitem.journal.eissn | 2150-8097 | -
Appears in Collections: Research publications
Files in This Item:
File | Description | Size | Format
p732-daenen.pdf | Published version | 324.84 kB | Adobe PDF

Scopus citations: 3 (checked on Sep 3, 2020)
Page view(s): 104 (checked on Sep 6, 2022)
Download(s): 200 (checked on Sep 6, 2022)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.