Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/32466
Title: A QUALITY MEASURE FOR MULTI-LABEL DATASETS ON THE APACHE SPARK FRAMEWORK
Authors: Sánchez, Ricardo
BELLO GARCIA, Marilyn 
Morell, Carlos
Bello, Rafael
VANHOOF, Koen 
Issue Date: 2019
Source: Proceedings of the 2nd International Conference of Information Processing CIPI - IOTAI 2019
Abstract: In recent years, the amount of data has increased considerably, and handling these volumes of information has become more complex. Measuring data quality is pivotal for assessing a classifier's discriminatory power, since a classifier's accuracy depends heavily on the data used to build the model. Multi-label classification is a specific type of classification problem that has generated increasing interest in recent years. However, no quality measures for multi-label datasets are implemented in cluster computing frameworks for evaluating large datasets. This work aims to implement a data quality measure for multi-label datasets, based on Granular Computing, under the Apache Spark framework. As a result, it was possible to calculate the values of the quality measure for the datasets in relatively short times.
Keywords: Apache Spark; quality measure; multi-label classification
Document URI: http://hdl.handle.net/1942/32466
Link to publication/dataset: https://convencion.uclv.cu/event/2nd-international-conference-of-information-processing-cipi-iotai-2019-international-workshop-of-internet-of-things-artificial-intelligence-2019-06-24-2019-06-29-37/track/a-quality-measure-for-multi-label-datasets-on-the-apache-spark-framework-1642
ISBN: 9789593123723
Category: C1
Type: Proceedings Paper
Validations: vabb 2023
Appears in Collections: Research publications

Files in This Item:
File: IoT-AI2019-3pag.pdf
Description: Published version
Size: 179.94 kB
Format: Adobe PDF

