Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/40374
Title: Machine Learning Algorithm to Estimate Distant Breast Cancer Recurrence at the Population Level with Administrative Data
Authors: Izci, Hava
Macq, Gilles
Tambuyzer, Tim
De Schutter, Harlinde
Wildiers, Hans
Duhoux, Francois P.
de Azambuja, Evandro
Taylor, Donatienne
Staelens, Gracienne
Orye, Guy
Hlavata, Zuzana
Hellemans, Helga
De Rop, Carine
Neven, Patrick
Verdoodt, Freija
Issue Date: 2023
Publisher: DOVE MEDICAL PRESS LTD
Source: Clinical Epidemiology, 15, p. 559-568
Abstract: Purpose: High-quality population-based cancer recurrence data are scarce, mainly due to the complexity and cost of registration. For the first time in Belgium, we developed a tool to estimate distant recurrence after a breast cancer diagnosis at the population level, based on real-world cancer registration and administrative data.
Methods: Data on distant cancer recurrence (including progression) from patients diagnosed with breast cancer between 2009 and 2014 were collected from medical files at 9 Belgian centers to train, test and externally validate an algorithm (i.e., the gold standard). Distant recurrence was defined as the occurrence of distant metastases between 120 days and 10 years after the primary diagnosis, with follow-up until December 31, 2018. Data from the gold standard were linked to population-based data from the Belgian Cancer Registry (BCR) and administrative data sources. Potential features to detect recurrences in administrative data were defined based on expert opinion from breast oncologists, and subsequently selected using bootstrap aggregation. Based on the selected features, classification and regression tree (CART) analysis was performed to construct an algorithm for classifying patients as having a distant recurrence or not.
Results: A total of 2507 patients were included, of whom 216 had a distant recurrence in the clinical data set. The algorithm showed a sensitivity of 79.5% (95% CI 68.8-87.8%), a positive predictive value (PPV) of 79.5% (95% CI 68.8-87.8%), and an accuracy of 96.7% (95% CI 95.4-97.7%). The external validation resulted in a sensitivity of 84.1% (95% CI 74.4-91.3%), a PPV of 84.1% (95% CI 74.4-91.3%), and an accuracy of 96.8% (95% CI 95.4-97.9%).
Conclusion: Our algorithm detected distant breast cancer recurrences with a good overall accuracy of 96.8%, as observed in the first multi-centric external validation exercise.
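Illustration (not part of the published record): the abstract reports sensitivity, PPV and accuracy with 95% confidence intervals. The sketch below shows how such figures can be derived from a 2x2 confusion matrix. It rests on assumptions: the counts are hypothetical and not taken from the study, the function names are invented for this example, and the Wilson score interval is used only because it needs no external library; the abstract does not state which interval method the authors applied. Note that sensitivity and PPV coincide whenever the numbers of false negatives and false positives are equal, which is consistent with the identical values reported above.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 gives an approximate 95% interval. This method is an
    assumption for illustration; the abstract does not name one.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def classification_metrics(tp, fp, fn, tn):
    """Return (point estimate, (low, high)) for each metric."""
    total = tp + fp + fn + tn
    return {
        # Share of true recurrences that the classifier flags.
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        # Share of flagged patients who truly had a recurrence.
        "ppv": (tp / (tp + fp), wilson_interval(tp, tp + fp)),
        # Share of all patients classified correctly.
        "accuracy": ((tp + tn) / total, wilson_interval(tp + tn, total)),
    }


# Hypothetical counts chosen for illustration only (NOT study data).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=440)

for name, (estimate, (low, high)) in metrics.items():
    print(f"{name}: {estimate:.1%} (95% CI {low:.1%}-{high:.1%})")
```

Because recurrences are rare relative to non-recurrences, accuracy is dominated by the true negatives, so sensitivity and PPV are the more informative figures for judging how well recurrences themselves are detected.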
Notes: Izci, H (corresponding author), Katholieke Univ Leuven, Dept oncol, Herestr 49 Box 7003-06, B-3000 Leuven, Belgium.
hava.izci@kuleuven.be
Keywords: machine learning;breast cancer;distant metastases;recurrences;algorithm;administrative data
Document URI: http://hdl.handle.net/1942/40374
ISSN: 1179-1349
e-ISSN: 1179-1349
DOI: 10.2147/CLEP.S400071
ISI #: 000987451800001
Rights: 2023 Izci et al. This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (http://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms (https://www.dovepress.com/terms.php)
Category: A1
Type: Journal Contribution
Appears in Collections: Research publications

Files in This Item:
File: CLEP_A_400071 559..568.pdf
Description: Published version
Size: 497.37 kB
Format: Adobe PDF

Web of Science citations: 1 (checked on Apr 30, 2024)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.