Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/48882Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | BEKKERS, Liese | - |
| dc.contributor.author | NEVEN, Frank | - |
| dc.contributor.author | Pantelis, Lorrens | - |
| dc.contributor.author | VANSUMMEREN, Stijn | - |
| dc.date.accessioned | 2026-04-09T13:36:55Z | - |
| dc.date.available | 2026-04-09T13:36:55Z | - |
| dc.date.issued | 2026 | - |
| dc.date.submitted | 2026-03-17T07:21:32Z | - |
| dc.identifier.uri | http://hdl.handle.net/1942/48882 | - |
| dc.description.abstract | We introduce the problem of Poisson sampling over joins: compute a sample of the result of a join query by conceptually performing a Bernoulli trial for each join tuple, using a non-uniform and tuple-specific probability. We propose an algorithm for Poisson sampling over acyclic joins that is nearly instance-optimal, running in time O(N + k \log N) where N is the size of the input database, and k is the size of the resulting sample. Our algorithm hinges on two building blocks: (1) The construction of a random-access index that allows, given a number i, to randomly access the i-th join tuple without fully materializing the (possibly large) join result; (2) The probing of this index to construct the result sample. We study the engineering trade-offs required to make both components practical, focusing on their implementation in column stores, and identify best-performing alternatives for both. Our experiments on real-world data demonstrate that this pair of alternatives significantly outperforms the repeated-Bernoulli-trial algorithm for Poisson sampling while also demonstrating that the random-access index by itself can be used to competively implement Yannakakis' acyclic join processing algorithm when no sampling is required. This shows that, as far a query engine design is concerned, it is possible to adopt a uniform basis for both classical acyclic join processing and Poisson sampling, both without regret compared to classical join and sampling algorithms. | - |
| dc.language.iso | en | - |
| dc.publisher | Association for Computing Machinery (ACM) | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. ©2026 Copyright held by the owner/author(s). | - |
| dc.subject.other | Computer Science - Databases | - |
| dc.title | Poisson Sampling over Acyclic Joins | - |
| dc.type | Preprint | - |
| local.bibliographicCitation.conferencename | ACM SIGMOD/PODS Conference | - |
| local.bibliographicCitation.conferenceplace | Bengaluru, India | - |
| dc.identifier.issue | 3 | - |
| dc.identifier.volume | 4 | - |
| local.bibliographicCitation.jcat | O | - |
| local.type.refereed | Refereed | - |
| local.type.specified | Preprint | - |
| local.bibliographicCitation.artnr | 224 | - |
| dc.identifier.doi | 10.1145/3802101 | - |
| dc.identifier.arxiv | https://arxiv.org/abs/2603.10982 | - |
| dc.description.other | ACMReference Format: Liese Bekkers, Frank Neven, Lorrens Pantelis, and Stijn Vansummeren. 2026. Poisson Sampling over Acyclic Joins. Proc. ACM Manag. Data 4, 3 (SIGMOD), Article 224 (June 2026), 34 pages. https://doi.org/10.1145/3802101 | - |
| local.provider.type | - | |
| local.uhasselt.international | no | - |
| item.fullcitation | BEKKERS, Liese; NEVEN, Frank; Pantelis, Lorrens & VANSUMMEREN, Stijn (2026) Poisson Sampling over Acyclic Joins. | - |
| item.contributor | BEKKERS, Liese | - |
| item.contributor | NEVEN, Frank | - |
| item.contributor | Pantelis, Lorrens | - |
| item.contributor | VANSUMMEREN, Stijn | - |
| item.accessRights | Open Access | - |
| item.fulltext | With Fulltext | - |
| crisitem.journal.eissn | 2836-6573 | - |
| Appears in Collections: | Research publications | |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| 224-bekkers.pdf | Published version | 851.99 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.