Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/41479
Title: A longitudinal transition imputation model for categorical data applied to a large registry dataset
Authors: Mamouris, Pavlos
NASSIRI, Vahid 
VERBEKE, Geert 
JANSSENS, Arne 
Vaes, Bert
MOLENBERGHS, Geert 
Issue Date: 2023
Publisher: WILEY
Source: STATISTICS IN MEDICINE, 42 (29), p. 5405-5418
Abstract: Imputation of longitudinal categorical covariates with several waves and many predictors is cumbersome in terms of implausible transitions, collinearity, and overfitting. We designed a simulation study with data obtained from a general practitioners' morbidity registry in Belgium for three waves, with smoking as the longitudinal covariate of interest. We set varying proportions of data on smoking to missing completely at random and missing not at random with proportions of missingness equal to 10%, 30%, 50%, and 70%. This study proposed a 3-stage approach that allows flexibility when imputing time-dependent categorical covariates. First, multiple imputation using fully conditional specification or multiple imputation for the predictor variables was deployed using the wide format such that previous and future information of the same patient was utilized. Second, a joint Markov transition model for initial, forward, backward, and intermittent probabilities was developed for each imputed dataset. Finally, this transition model was used for imputation. We compared the performance of this methodology with an analysis of the complete data and with listwise deletion in terms of bias and root mean square error. Next, we applied this methodology in a clinical case for years 2017 to 2021, where we estimated the effect of several covariates on pneumococcal vaccination. This methodological framework ensures that the plausibility of transitions is preserved, overfitting and collinearity issues are resolved, and confounders can be utilized. Finally, a companion R package was developed to enable the replication and easy application of this methodology.
Notes: Mamouris, P (corresponding author), Katholieke Univ Leuven, Dept Publ Hlth & Primary Care, Kapucijnenvoer 33,H Bldg, B-3000 Leuven, Belgium.
pavlos.mamouris@kuleuven.be
Keywords: multiple imputation;registry data;smoking outcome;transition probabilities
Document URI: http://hdl.handle.net/1942/41479
ISSN: 0277-6715
e-ISSN: 1097-0258
DOI: 10.1002/sim.9919
ISI #: 001070790300001
Rights: 2023 John Wiley & Sons Ltd
Category: A1
Type: Journal Contribution
Appears in Collections: Research publications

Files in This Item:
File: ACFrOgBHS2BMM8Et.pdf
Access: Until 2024-09-30
Description: Peer-reviewed author version
Size: 852.66 kB
Format: Adobe PDF

File: A longitudinal transition imputation model for categorical data applied to a large registry dataset.pdf
Access: Restricted Access
Description: Published version
Size: 2.82 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.