Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/47163
Title: NLP-Based Hospital Diagnosis Reporting Aid
Authors: Nijssen, Gwendoline 
Advisors: NEVEN, Frank
VANDEVOORT, Brecht
Issue Date: 2025
Publisher: tUL
Abstract: This thesis explores the use of NLP to automate hospital diagnosis reporting for patients with Atrial Fibrillation (AF), following European Society of Cardiology guidelines. The two-phase study used the English MIMIC-IV dataset (40,000+ records) and Dutch data from Jessa Hospital (12,516 records) to develop models for AF classification and CHA2DS2-VASc score extraction. For AF classification, XGBoost with TF-IDF features achieved 95% accuracy on English cardiology data and 93% on Dutch data. Enhanced n-gram approaches improved Dutch performance from 0.71 to 0.78 F1-score by capturing negation patterns. For score extraction, a fine-tuned MedRoBERTa.nl model achieved 95% accuracy and a macro F1-score of 0.82, and was strongest at identifying missing scores (0.967 F1-score). Quality analysis revealed documentation gaps: only 44.5% of 1,489 AF patients had a documented CHA2DS2-VASc score, and 26.7% had clinically significant scores requiring anticoagulation consideration. The research demonstrates the feasibility of automated clinical text processing that combines traditional ML for classification with transformers for extraction. Results exceeded the hypothesized thresholds (90% for classification, 85% for extraction) while highlighting cross-language processing challenges and class imbalance issues. Future work includes multi-label AF classification, improved embeddings, and continued research on the automated reporting of other quality indicators for patients with AF.
Notes: Master of Computer Science (master in de informatica)
Document URI: http://hdl.handle.net/1942/47163
Category: T2
Type: Theses and Dissertations
Appears in Collections: Master theses
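
The abstract attributes the Dutch F1-score gain to n-gram features that capture negation. The following is a minimal sketch of why word bigrams can help, using only the Python standard library. The helper `word_ngrams` and the two example sentences are illustrative assumptions; they are not taken from the thesis, its datasets, or its actual TF-IDF and XGBoost pipeline.

```python
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Count word n-grams of every length from n_min to n_max (inclusive)."""
    tokens = text.lower().split()
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams


# Hypothetical report snippets (assumption: not from MIMIC-IV or Jessa data).
positive = "patient has atrial fibrillation"
negated = "patient has no atrial fibrillation"

# Unigrams only: the two notes differ by the single, context-free token "no".
uni_pos = word_ngrams(positive, 1, 1)
uni_neg = word_ngrams(negated, 1, 1)
only_in_negated = set(uni_neg) - set(uni_pos)

# Unigrams plus bigrams: "no atrial" ties the negation to the finding,
# giving a classifier a feature that appears only in the negated note.
bi_pos = word_ngrams(positive, 1, 2)
bi_neg = word_ngrams(negated, 1, 2)
has_negation_feature = "no atrial" in bi_neg and "no atrial" not in bi_pos
```

In a bag-of-words model, a lone "no" could negate anything in the note, whereas a bigram such as "no atrial" is specific to the finding of interest. In a real pipeline, the same idea would typically be expressed through the n-gram range setting of a TF-IDF vectorizer rather than a hand-written helper.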

Files in This Item:
File: cc6d1264-19a9-4557-81b8-f28338c0f32d.pdf
Size: 1.3 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.