Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/46103
Title: Instance-Optimal Acyclic Join Processing Without Regret: Engineering the Yannakakis Algorithm in Column Stores
Authors: BEKKERS, Liese 
NEVEN, Frank 
VANSUMMEREN, Stijn 
Wang, Yisu Remy
Issue Date: 2025
Publisher: VLDB Endowment
Source: Proceedings of the Vldb Endowment, 18 (8)
Status: In press
Abstract: Acyclic join queries can be evaluated instance-optimally using Yannakakis' algorithm, which avoids needlessly large intermediate results through semi-join passes. Recent work proposes to address the significant hidden constant factors arising from a naive implementation of Yannakakis by decomposing the hash join operator into two suboperators, called Lookup and Expand. In this paper, we present a novel method for integrating Lookup and Expand plans in interpreted environments, like column stores, formalizing them using Nested Semijoin Algebra (NSA) and implementing them through a shredding approach. We characterize the class of NSA expressions that can be evaluated instance-optimally as those that are 2-phase: no `shrinking' operator is applied after an unnest (i.e., expand). We introduce Shredded Yannakakis (SYA), an evaluation algorithm for acyclic joins that, starting from a binary join plan, transforms it into a 2-phase NSA plan, and then evaluates it through the shredding technique. We show that SYA is provably robust (i.e., never produces large intermediate results) and without regret (i.e., is never worse than the binary join plan under a suitable cost model) on the class of well-behaved binary join plans. Our experiments on a suite of 1,849 queries show that SYA improves performance for 88.7% of the queries with speedups up to 188x, while remaining competitive on the other queries. We hope this approach offers a fresh perspective on Yannakakis' algorithm, helping system engineers better understand its practical benefits and facilitating its adoption into a broader spectrum of query engines.
Keywords: Computer Science - Databases;H.2
Document URI: http://hdl.handle.net/1942/46103
Link to publication/dataset: https://arxiv.org/abs/2411.04042v2
ISSN: 2150-8097
e-ISSN: 2150-8097
Category: A1
Type: Journal Contribution
Appears in Collections:Research publications

Files in This Item:
File Description SizeFormat 
main.pdf
  Restricted Access
Peer-reviewed author version1.31 MBAdobe PDFView/Open    Request a copy
Show full item record

Google ScholarTM

Check


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.