Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/33621
Full metadata record
DC Field | Value | Language
dc.contributor.author | BELLO GARCIA, Marilyn | -
dc.contributor.author | NAPOLES RUIZ, Gonzalo | -
dc.contributor.author | VANHOOF, Koen | -
dc.contributor.author | Bello, Rafael | -
dc.date.accessioned | 2021-03-03T09:08:37Z | -
dc.date.available | 2021-03-03T09:08:37Z | -
dc.date.issued | 2021 | -
dc.date.submitted | 2021-03-02T21:54:24Z | -
dc.identifier.citation | INFORMATION SCIENCES, 560, p. 51-67 | -
dc.identifier.issn | 0020-0255 | -
dc.identifier.uri | http://hdl.handle.net/1942/33621 | -
dc.description.abstract | Rough set theory is a granular computing formalism that allows analyzing a given dataset through well-defined measures. Some of these measures aim to characterize datasets used to discover knowledge, mostly in traditional classification problems. Measuring the data quality is pivotal to estimate beforehand the problem's difficulty since a classification model's accuracy heavily depends on the data quality. However, to the best of our knowledge, there are no measures devoted to analyzing the quality of multi-label datasets. In this paper, we propose six data quality measures for multi-label problems, which are based on different granular approaches. Some of these measures redefine the decision class concept, while others redefine the consistency concept. Moreover, we study the impact of the similarity threshold parameters and the distance functions on the behavior of these measures. The numerical simulations show a statistical correlation between the measures that redefine the consistency concept and the performance of the ML-kNN classifier. | -
dc.description.sponsorship | The authors would like to thank the anonymous reviewers for their valuable and constructive feedback. | -
dc.language.iso | en | -
dc.publisher | ELSEVIER SCIENCE INC | -
dc.rights | 2021 Elsevier Inc. All rights reserved. | -
dc.subject.other | Multi-label classification | -
dc.subject.other | Granular computing | -
dc.subject.other | Rough set theory | -
dc.subject.other | Data quality measures | -
dc.title | Data quality measures based on granular computing for multi-label classification | -
dc.type | Journal Contribution | -
dc.identifier.epage | 67 | -
dc.identifier.spage | 51 | -
dc.identifier.volume | 560 | -
local.bibliographicCitation.jcat | A1 | -
local.publisher.place | STE 800, 230 PARK AVE, NEW YORK, NY 10169 USA | -
local.type.refereed | Refereed | -
local.type.specified | Article | -
dc.identifier.doi | 10.1016/j.ins.2021.01.027 | -
dc.identifier.isi | WOS:000670877900004 | -
dc.identifier.eissn | 1872-6291 | -
local.provider.type | CrossRef | -
local.uhasselt.uhpub | yes | -
local.uhasselt.international | yes | -
item.contributor | BELLO GARCIA, Marilyn | -
item.contributor | NAPOLES RUIZ, Gonzalo | -
item.contributor | VANHOOF, Koen | -
item.contributor | Bello, Rafael | -
item.fullcitation | BELLO GARCIA, Marilyn; NAPOLES RUIZ, Gonzalo; VANHOOF, Koen & Bello, Rafael (2021) Data quality measures based on granular computing for multi-label classification. In: INFORMATION SCIENCES, 560, p. 51-67. | -
item.accessRights | Open Access | -
item.fulltext | With Fulltext | -
item.validation | ecoom 2022 | -
crisitem.journal.issn | 0020-0255 | -
crisitem.journal.eissn | 1872-6291 | -
Appears in Collections: Research publications
Files in This Item:
File | Description | Size | Format
1-s2.0-S0020025521000542-main.pdf (Restricted Access) | Published version | 741.26 kB | Adobe PDF
manuscript.pdf | Peer-reviewed author version | 752.99 kB | Adobe PDF

Web of Science citations: 18 (checked on May 2, 2024)
Page view(s): 60 (checked on Sep 5, 2022)
Download(s): 4 (checked on Sep 5, 2022)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.