Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/45094
Title: Measuring Approximate Functional Dependencies: A Comparative Study
Authors: PARCIAK, Marcel; WEYTJENS, Sebastiaan; HENS, Niel; NEVEN, Frank; PEETERS, Liesbet; VANSUMMEREN, Stijn
Issue Date: 2024
Publisher: IEEE
Source: 40th IEEE International Conference on Data Engineering, IEEE, p. 3505-3518
Abstract: Approximate functional dependencies (AFDs) are functional dependencies (FDs) that “almost” hold in a relation. While various measures have been proposed to quantify the level to which an FD holds approximately, they are difficult to compare and it is unclear which measure is preferable when one needs to discover FDs in real-world data, i.e., data that only approximately satisfies the FD. In response, this paper formally and qualitatively compares AFD measures. We obtain a formal comparison through a novel presentation of measures in terms of Shannon and logical entropy. Qualitatively, we perform a sensitivity analysis w.r.t. structural properties of input relations and quantitatively study the effectiveness of AFD measures for ranking AFDs on real-world data. Based on this analysis, we give clear recommendations for the AFD measures to use in practice.
Keywords: functional dependencies; data cleaning; data profiling
Document URI: http://hdl.handle.net/1942/45094
Link to publication/dataset: http://arxiv.org/abs/2312.06296v1
DOI: https://doi.org/10.1109/ICDE60146.2024.00270
Rights: 2024 IEEE
Category: C1
Type: Proceedings Paper
Appears in Collections: Research publications

Files in This Item:
File: 2312.06296v1.pdf (Restricted Access)
Description: Published version
Size: 429.75 kB
Format: Adobe PDF
