Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/42825
Title: Juegos del español – Iterative Design, Evaluation and Implementation of Games with a Purpose to Enhance Parts-of-Speech Tagging in a Corpus of European Spanish Dialects
Authors: SEGUNDO DIAZ, Rosa Lilia 
Advisors: Coninx, Karin; Hoste, Veronique; Rovelo Ruiz, Gustavo; Bouzouita, Miriam
Issue Date: 2024
Abstract: Many discoveries in linguistics are made through the analysis of very large collections of data. These collections are used for Natural Language Processing (NLP), one of the most thriving branches of data science today, with applications such as machine translation, information retrieval, speech recognition, speech synthesis, computer-aided language learning, communication support for the visually impaired, and many others. Most advanced approaches in this domain are data-driven and, as such, rely on large amounts of data on which to train NLP tools. However, these tools perform poorly for languages that suffer from data scarcity, e.g., low-resourced or undocumented languages, which undermines their study. Language resources are often created manually by expert annotators, which can be expensive and time-consuming. Therefore, researchers have explored alternatives to expert annotation, such as non-expert collaboration motivated by micro-payments or by other incentives like enjoyment. Games With a Purpose (GWAPs) have used enjoyment as an incentive to engage people in annotation tasks in different languages, such as English and French, with encouraging results. This doctoral research explored GWAPs to build an annotated corpus of European Spanish dialects. To this end, a series of research activities was conducted within a multidisciplinary and user-centred design approach that involved experts in Linguistics, Dialectology and HCI to design, evaluate and implement this game-based approach. These GWAPs involved non-expert collaborative annotators in correcting and verifying automatic Parts-of-Speech (PoS) tags. To ground the game design, a systematic literature review was conducted to identify research efforts that have positively influenced player enjoyment. Those efforts were found in the form of Game Design Elements (GDEs), which provided the tools to design three GWAPs, i.e., Agentes, Tesoros and Anotatlón.
The iterative process of game development involved design and evaluation, bringing together diverse perspectives from a multidisciplinary team and incorporating feedback from players. This approach fostered a deeper understanding of the underlying problem and facilitated informed decision-making to create well-designed GWAPs. Far from being mere proofs of concept, the GWAPs became fully operational tools in the last iterations, where both player enjoyment and the data collected were evaluated. This dissertation provides evidence of the potential benefits of using gamified approaches to verify PoS tags in spoken dialectal Spanish. Although the large-scale annotation achieved in other linguistic domains and by other GWAPs was not observed, the evaluations indicated that non-experts were able to correct and verify tokens by playing the GWAPs. Additionally, the GWAPs improved over multiple iterations, and the GDEs effectively enhanced player enjoyment. Further, the analysis of the collected data provided insights into data quality and the factors affecting human annotators' performance in PoS tagging tasks. However, further research is still needed to refine the methodology and address the challenges associated with initial engagement and long-term retention.
Document URI: http://hdl.handle.net/1942/42825
Category: T1
Type: Theses and Dissertations
Appears in Collections: Research publications

Files in This Item:
File: Binder1.pdf
Description: Until 2029-04-29
Size: 12.08 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.