Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/41479
Title: | A longitudinal transition imputation model for categorical data applied to a large registry dataset |
Authors: | Mamouris, Pavlos; NASSIRI, Vahid; VERBEKE, Geert; JANSSENS, Arne; Vaes, Bert; MOLENBERGHS, Geert |
Issue Date: | 2023 |
Publisher: | WILEY |
Source: | STATISTICS IN MEDICINE, 42 (29), p. 5405-5418 |
Abstract: | Imputation of longitudinal categorical covariates with several waves and many predictors is cumbersome in terms of implausible transitions, collinearity, and overfitting. We designed a simulation study with data obtained from a general practitioners' morbidity registry in Belgium for three waves, with smoking as the longitudinal covariate of interest. We set varying proportions of data on smoking to missing completely at random and missing not at random with proportions of missingness equal to 10%, 30%, 50%, and 70%. This study proposed a 3-stage approach that allows flexibility when imputing time-dependent categorical covariates. First, multiple imputation using fully conditional specification or multiple imputation for the predictor variables was deployed using the wide format such that previous and future information of the same patient was utilized. Second, a joint Markov transition model for initial, forward, backward, and intermittent probabilities was developed for each imputed dataset. Finally, this transition model was used for imputation. We compared the performance of this methodology with an analysis of the complete data and with listwise deletion in terms of bias and root mean square error. Next, we applied this methodology in a clinical case for years 2017 to 2021, where we estimated the effect of several covariates on the pneumococcal vaccination. This methodological framework ensures that the plausibility of transitions is preserved, overfitting and collinearity issues are resolved, and confounders can be utilized. Finally, a companion R package was developed to enable the replication and easy application of this methodology. |
Notes: | Mamouris, P (corresponding author), Katholieke Univ Leuven, Dept Publ Hlth & Primary Care, Kapucijnenvoer 33,H Bldg, B-3000 Leuven, Belgium. pavlos.mamouris@kuleuven.be |
Keywords: | multiple imputation; registry data; smoking outcome; transition probabilities |
Document URI: | http://hdl.handle.net/1942/41479 |
ISSN: | 0277-6715 |
e-ISSN: | 1097-0258 |
DOI: | 10.1002/sim.9919 |
ISI #: | 001070790300001 |
Rights: | 2023 John Wiley & Sons Ltd |
Category: | A1 |
Type: | Journal Contribution |
Appears in Collections: | Research publications |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
ACFrOgBHS2BMM8Et.pdf | Peer-reviewed author version | 852.66 kB | Adobe PDF |
A longitudinal transition imputation model for categorical data applied to a large registry dataset.pdf (Restricted Access) | Published version | 2.82 MB | Adobe PDF |