Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/45992
Title: Quantifying and modeling explicit and implicit bias in pattern classification
Authors: KOUTSOVITI-KOUMERI, Lisa 
Advisors: Vanhoof, Koenraad
Nápoles, Gonzalo
Issue Date: 2025
Abstract: Automated decision-making (ADM) systems that classify people based on historical data can produce biased decisions. Automated decisions include receiving a bank loan, qualifying for medical treatment or social benefits, and getting hired or admitted to a university; each can have a profound impact on someone's future. Decision-makers are therefore obliged by law to ensure that decisions are fair and unbiased. In machine learning, bias, also referred to as discrimination or favoritism, is defined as unfair or prejudiced outcomes caused by biased data or a biased classification model with regard to personal characteristics, such as ethnicity or gender, that should not inform the decision at hand. Ensuring that ADM systems treat people fairly remains an unsolved challenge. ADM tools are now widely used, first because they leverage the growing volume of available data and, second, because such systems are believed to be less biased than human decision-makers. Unfortunately, biased beliefs can still permeate the decision-making process in various ways, usually through the data used to train the algorithms.

The need to measure bias encoded in tabular data used to solve pattern classification problems is widely recognized by academia, legislators, and enterprises alike. Numerous measures have been proposed since 2010. However, existing measures capture different aspects of bias and, as expected, have several limitations. Consequently, despite the plethora of available approaches, researchers still call for novel bias measures to shed light on the complex ways bias is reflected in data patterns. This thesis adds two new fairness measures to the existing arsenal, examines their placement in the machine learning pipeline, and contrasts them with existing approaches to identify their similarities, advantages, and limitations. The first proposed measure is based on fuzzy rough set theory. It captures explicit bias expressed as inconsistencies in decision-making, where similar people received different treatment with regard to a so-called sensitive feature in a dataset, such as race or gender. The second measure uses a recurrent neural network, a Fuzzy Cognitive Map (FCM), to quantify implicit bias expressed as the complex interactions and feedback loops between sensitive and non-sensitive features.

The suggested fairness measures have several advantages. They examine bias in the data, whereas the majority of existing approaches measure bias with regard to a classifier. However, classifiers merely replicate patterns in the training data, which can already be inconsistent and reflect societal biases. Moreover, existing approaches focus mainly on quantifying group-based notions of fairness, which require that two groups of people be treated similarly. Reality is more nuanced, however, and often requires evaluating bias on a case-by-case basis and considering contextual factors. The proposed measures are versatile in that they can quantify fairness towards an individual, a group of people, and a feature as a whole. To be clear, we do not claim that the proposed measures are better at capturing bias than existing approaches, but rather that they offer new information to further understand how historical inequalities permeate our data.
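To illustrate the indiscernibility idea behind the first measure, the sketch below sets up a fuzzy rough consistency check in Python. It demonstrates the general principle rather than the thesis's exact formulation: the minimum t-norm, the Łukasiewicz implicator, and the toy data are assumptions made here for brevity.

    import numpy as np

    def fuzzy_similarity(X):
        # Per-feature similarity 1 - normalized distance, aggregated
        # with the minimum t-norm (one common choice in fuzzy rough models).
        ranges = X.max(axis=0) - X.min(axis=0)
        ranges[ranges == 0] = 1.0  # guard against constant features
        sim = np.ones((len(X), len(X)))
        for j in range(X.shape[1]):
            d = np.abs(X[:, j, None] - X[None, :, j]) / ranges[j]
            sim = np.minimum(sim, 1.0 - d)
        return sim

    def consistency(X, y):
        # Mean membership of each instance in the fuzzy rough lower
        # approximation of its own decision class: 1.0 means perfectly
        # consistent labels; lower values signal that similar people
        # received different treatment.
        R = fuzzy_similarity(X)
        same = (y[:, None] == y[None, :]).astype(float)
        impl = np.minimum(1.0, 1.0 - R + same)  # Lukasiewicz implicator
        return impl.min(axis=1).mean()

    # Bias signal: how much does label consistency drop once the
    # sensitive column is excluded from the similarity relation?
    rng = np.random.default_rng(0)
    X = rng.random((200, 4))         # toy data; column 0 plays the sensitive role
    y = (X[:, 0] > 0.5).astype(int)  # the label leaks the sensitive feature
    print(f"with sensitive feature:    {consistency(X, y):.3f}")
    print(f"without sensitive feature: {consistency(X[:, 1:], y):.3f}")

A large drop suggests that the decision labels can only be explained by appealing to the sensitive feature, which is the kind of explicit bias the measure targets.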
The proposed measures still suffer from limitations that should be tackled in future work. The fuzzy rough set-based measure is computationally expensive and expresses bias as indiscernibility while ignoring dominance relations. The FCM-based measure depends on expert knowledge, which can be subjective and is not always available. In addition, the compatibility of the proposed measures with legal and ethical understandings of fairness and bias should be further developed to ensure that they comply with emerging AI regulations.
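The implicit bias idea behind the second measure can be sketched in the same spirit. Below, a small hypothetical FCM is iterated to a fixed point twice, once for each value of a clamped sensitive concept; the concept names, the weight matrix, and the clamping scheme are illustrative assumptions, not values taken from the thesis.

    import numpy as np

    def fcm_reason(W, a0, clamp=None, steps=100, tol=1e-6):
        # Iterate a Fuzzy Cognitive Map, a[t+1] = sigmoid(W @ a[t]),
        # holding clamped concepts fixed, until a fixed point is reached.
        a = a0.copy()
        for _ in range(steps):
            nxt = 1.0 / (1.0 + np.exp(-W @ a))
            if clamp:
                for i, v in clamp.items():
                    nxt[i] = v
            if np.max(np.abs(nxt - a)) < tol:
                break
            a = nxt
        return a

    # Hypothetical 4-concept map: 0 = gender (sensitive), 1 = income,
    # 2 = zip code, 3 = loan decision. W[i, j] is the causal weight of
    # concept j on concept i; in practice the weights come from experts.
    W = np.array([
        [0.0, 0.0, 0.0, 0.0],  # gender receives no influence
        [0.6, 0.0, 0.0, 0.0],  # income is pushed up by gender
        [0.4, 0.0, 0.0, 0.0],  # zip code correlates with gender
        [0.0, 0.7, 0.5, 0.0],  # the decision uses only income and zip code
    ])
    low  = fcm_reason(W, np.array([0.0, 0.5, 0.5, 0.5]), clamp={0: 0.0})
    high = fcm_reason(W, np.array([1.0, 0.5, 0.5, 0.5]), clamp={0: 1.0})
    print("decision shift:", high[3] - low[3])

Even though the weight from gender to the decision concept is zero, the decision shifts when gender is perturbed because its influence propagates through income and zip code. This indirect pathway is the kind of implicit bias the FCM-based measure is meant to surface.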
Document URI: http://hdl.handle.net/1942/45992
Category: T1
Type: Theses and Dissertations
Appears in Collections: Research publications

Files in This Item:
File: thesis.pdf (embargoed until 2030-04-26)
Description: Published version
Size: 39.58 MB
Format: Adobe PDF