Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/26489
Title: Distributed learning: Developing a predictive model based on data from multiple hospitals without data leaving the hospital - A real life proof of concept
Authors: Jochems, Arthur
Deist, Timo M.
Van Soest, Johan
Eble, Michael
BULENS, Paul 
Coucke, Philippe
Dries, Wim
Lambin, Philippe
Dekker, Andre
Issue Date: 2016
Source: RADIOTHERAPY AND ONCOLOGY, 121(3), p. 459-467
Abstract: Purpose: One of the major hurdles in enabling personalized medicine is obtaining sufficient patient data to feed into predictive models. Combining data originating from multiple hospitals is difficult because of ethical, legal, political, and administrative barriers associated with data sharing. To avoid these issues, a distributed learning approach can be used. Distributed learning is defined as learning from data without the data leaving the hospital. Patients and methods: Clinical data from 287 lung cancer patients, treated with curative intent with chemoradiation (CRT) or radiotherapy (RT) alone, were collected from and stored in 5 different medical institutes (123 patients at MAASTRO (Netherlands, Dutch), 24 at Jessa (Belgium, Dutch), 34 at Liege (Belgium, Dutch and French), 48 at Aachen (Germany, German) and 58 at Eindhoven (Netherlands, Dutch)). A Bayesian network model is adapted for distributed learning (watch the animation: http://youtu.be/nQpqMIuHyOk). The model predicts dyspnea, which is a common side effect after radiotherapy treatment of lung cancer. Results: We show that it is possible to use the distributed learning approach to train a Bayesian network model on patient data originating from multiple hospitals without these data leaving the individual hospital. The AUC of the model is 0.61 (95% CI, 0.51-0.70) on a 5-fold cross-validation and ranges from 0.59 to 0.71 on external validation sets. Conclusion: Distributed learning can allow the learning of predictive models on data originating from multiple hospitals while avoiding many of the data sharing barriers. Furthermore, the distributed learning approach can be used to extract and employ knowledge from routine patient data from multiple hospitals while being compliant with the various national and European privacy laws. (C) 2016 The Author(s). Published by Elsevier Ireland Ltd.
Keywords: Bayesian networks; distributed learning; privacy preserving data-mining; dyspnea; machine learning
Document URI: http://hdl.handle.net/1942/26489
ISSN: 0167-8140
e-ISSN: 1879-0887
DOI: 10.1016/j.radonc.2016.10.002
ISI #: 000391905200018
Rights: (C) 2016 The Author(s). Published by Elsevier Ireland Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Category: A1
Type: Journal Contribution
Appears in Collections: Research publications

Files in This Item:
File: JOchems.pdf
Description: Published version
Size: 821.65 kB
Format: Adobe PDF

Scopus citations: 50 (checked on Sep 3, 2020)
Web of Science citations: 108 (checked on Apr 22, 2024)
Page views: 76 (checked on Sep 5, 2022)
Downloads: 114 (checked on Sep 5, 2022)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.