Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/45094
Title: | Measuring Approximate Functional Dependencies: A Comparative Study |
Authors: | PARCIAK, Marcel; WEYTJENS, Sebastiaan; HENS, Niel; NEVEN, Frank; PEETERS, Liesbet; VANSUMMEREN, Stijn |
Issue Date: | 2024 |
Publisher: | IEEE |
Source: | 40th IEEE International Conference on Data Engineering, IEEE, p. 3505-3518 |
Abstract: | Approximate functional dependencies (AFDs) are functional dependencies (FDs) that “almost” hold in a relation. While various measures have been proposed to quantify the level to which an FD holds approximately, they are difficult to compare and it is unclear which measure is preferable when one needs to discover FDs in real-world data, i.e., data that only approximately satisfies the FD. In response, this paper formally and qualitatively compares AFD measures. We obtain a formal comparison through a novel presentation of measures in terms of Shannon and logical entropy. Qualitatively, we perform a sensitivity analysis w.r.t. structural properties of input relations and quantitatively study the effectiveness of AFD measures for ranking AFDs on real world data. Based on this analysis, we give clear recommendations for the AFD measures to use in practice. |
Keywords: | functional dependencies; data cleaning; data profiling |
Document URI: | http://hdl.handle.net/1942/45094 |
Link to publication/dataset: | http://arxiv.org/abs/2312.06296v1 |
DOI: | https://doi.org/10.1109/ICDE60146.2024.00270 |
Rights: | 2024 IEEE |
Category: | C1 |
Type: | Proceedings Paper |
Appears in Collections: | Research publications |
Files in This Item:
File | Description | Size | Format | Access |
---|---|---|---|---|
2312.06296v1.pdf | Published version | 429.75 kB | Adobe PDF | Restricted access |