Data quality measures based on granular computing for multi-label classification

BELLO GARCIA, Marilyn; NAPOLES RUIZ, Gonzalo; VANHOOF, Koen; Bello, Rafael

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/33621

Full metadata record

DC Field	Value	Language
dc.contributor.author	BELLO GARCIA, Marilyn	-
dc.contributor.author	NAPOLES RUIZ, Gonzalo	-
dc.contributor.author	VANHOOF, Koen	-
dc.contributor.author	Bello, Rafael	-
dc.date.accessioned	2021-03-03T09:08:37Z	-
dc.date.available	2021-03-03T09:08:37Z	-
dc.date.issued	2021	-
dc.date.submitted	2021-03-02T21:54:24Z	-
dc.identifier.citation	Information Sciences, 560, p. 51-67	-
dc.identifier.issn	0020-0255	-
dc.identifier.uri	http://hdl.handle.net/1942/33621	-
dc.description.abstract	Rough set theory is a granular computing formalism that allows analyzing a given dataset through well-defined measures. Some of these measures aim to characterize datasets used to discover knowledge, mostly in traditional classification problems. Measuring the data quality is pivotal to estimate beforehand the problem's difficulty since a classification model's accuracy heavily depends on the data quality. However, to the best of our knowledge, there are no measures devoted to analyzing the quality of multi-label datasets. In this paper, we propose six data quality measures for multi-label problems, which are based on different granular approaches. Some of these measures redefine the decision class concept, while others redefine the consistency concept. Moreover, we study the impact of the similarity threshold parameters and the distance functions on the behavior of these measures. The numerical simulations show a statistical correlation between the measures that redefine the consistency concept and the performance of the ML-kNN classifier.	-
dc.description.sponsorship	The authors would like to thank the anonymous reviewers for their valuable and constructive feedback.	-
dc.language.iso	en	-
dc.publisher	ELSEVIER SCIENCE INC	-
dc.rights	2021 Elsevier Inc. All rights reserved.	-
dc.subject.other	Multi-label classification	-
dc.subject.other	Granular computing	-
dc.subject.other	Rough set theory	-
dc.subject.other	Data quality measures	-
dc.title	Data quality measures based on granular computing for multi-label classification	-
dc.type	Journal Contribution	-
dc.identifier.epage	67	-
dc.identifier.spage	51	-
dc.identifier.volume	560	-
local.format.pages	17	-
local.bibliographicCitation.jcat	A1	-
local.publisher.place	STE 800, 230 PARK AVE, NEW YORK, NY 10169 USA	-
local.type.refereed	Refereed	-
local.type.specified	Article	-
dc.identifier.doi	10.1016/j.ins.2021.01.027	-
dc.identifier.isi	WOS:000670877900004	-
dc.identifier.eissn	1872-6291	-
local.provider.type	CrossRef	-
local.uhasselt.uhpub	yes	-
local.uhasselt.international	yes	-
item.validation	ecoom 2022	-
item.contributor	BELLO GARCIA, Marilyn	-
item.contributor	NAPOLES RUIZ, Gonzalo	-
item.contributor	VANHOOF, Koen	-
item.contributor	Bello, Rafael	-
item.accessRights	Open Access	-
item.fulltext	With Fulltext	-
item.fullcitation	BELLO GARCIA, Marilyn; NAPOLES RUIZ, Gonzalo; VANHOOF, Koen & Bello, Rafael (2021) Data quality measures based on granular computing for multi-label classification. In: Information Sciences, 560, p. 51-67.	-
crisitem.journal.issn	0020-0255	-
crisitem.journal.eissn	1872-6291	-
Appears in Collections:	Research publications

Files in This Item:

File	Description	Size	Format
1-s2.0-S0020025521000542-main.pdf (Restricted Access)	Published version	741.26 kB	Adobe PDF
manuscript.pdf	Peer-reviewed author version	752.99 kB	Adobe PDF

Scopus citations: 30 (checked on Jun 3, 2026)
Web of Science citations: 23 (checked on Jun 10, 2026)
