Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/35321
Full metadata record
DC Field: Value
dc.contributor.advisor: NEVEN, Frank
dc.contributor.author: WEYTJENS, Sebastiaan
dc.date.accessioned: 2021-09-13T13:06:35Z
dc.date.available: 2021-09-13T13:06:35Z
dc.date.issued: 2021
dc.identifier.uri: http://hdl.handle.net/1942/35321
dc.description.abstract: Many companies now use data to optimize their processes. However, the collected data can contain inconsistencies caused by, for example, typing errors. Companies must therefore clean the data before deriving insights from it. One way to discover erroneous information is to find columns that determine other columns, a relationship called a Functional Dependency (FD). For example, two people who live in the same city have to live in the same country. However, because FDs do not tolerate errors, we need a method for finding dependencies that hold approximately in the relation, referred to as Approximate Functional Dependencies (AFDs). This thesis aims to design a relevance-focused tool for domain experts to discover AFDs. We review the existing measures for the degree of approximation of an AFD by testing them on various theoretical examples. Based on the findings of these tests, we decide on a combination of measures that focuses on discovering relevant AFDs. Then, we integrate those measures and other AFD metadata into the c-metric, a score representing the confidence in a particular AFD. Our extensive experimental evaluation of the c-metric shows that it is significantly more suitable for relevant AFD discovery than the existing approximation measures. Finally, to assist domain experts in discovering relevant AFDs, we implement a tool that visualizes our c-metric and other AFD metadata, such as probability distributions.
dc.format.mimetype: Application/pdf
dc.language: nl
dc.publisher: tUL
dc.title: Approximate functional dependencies: a comparison of measures and a relevance focused tool for discovery
dc.type: Theses and Dissertations
local.bibliographicCitation.jcat: T2
dc.description.notes: Master in Computer Science (master in de informatica)
local.type.specified: Master thesis
item.fullcitation: WEYTJENS, Sebastiaan (2021) Approximate functional dependencies: a comparison of measures and a relevance focused tool for discovery.
item.contributor: WEYTJENS, Sebastiaan
item.accessRights: Open Access
item.fulltext: With Fulltext
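
The city/country example in the abstract can be made concrete with a small sketch. The abstract does not name the measures the thesis reviews and does not define the c-metric, so the code below is not the thesis's method. It uses the g3 error, a measure from the general AFD literature: the smallest fraction of rows that must be removed for the dependency to hold exactly. The function name `g3_error` and the sample rows are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def g3_error(rows, lhs, rhs):
    """Return the g3 error of the dependency lhs -> rhs.

    This is the smallest fraction of rows that must be removed so that
    every combination of lhs values maps to exactly one rhs value.
    0.0 means the functional dependency holds exactly.
    """
    if not rows:
        return 0.0
    # For each distinct lhs value, count how often each rhs value occurs.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[column] for column in lhs)
        groups[key][row[rhs]] += 1
    # Within each group, keep only the rows carrying the most common rhs value.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1 - kept / len(rows)


# Hypothetical sample data: the third row contains a typing error.
rows = [
    {"city": "Hasselt", "country": "Belgium"},
    {"city": "Hasselt", "country": "Belgium"},
    {"city": "Hasselt", "country": "Belgum"},
    {"city": "Utrecht", "country": "Netherlands"},
]

error = g3_error(rows, ["city"], "country")
print(f"g3(city -> country) = {error}")
```

On this sample, one of the four rows conflicts with the majority value for its city, so city -> country fails as an exact FD but holds approximately, with a g3 error of 0.25.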
Appears in Collections: Master theses
Files in This Item:
File: 845d31c7-7084-4c8b-a964-d10bad246e52.pdf
Size: 2.4 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.