Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/4342
Title: | Model selection in a regression setting with missing covariate data | Authors: | HENS, Niel; AERTS, Marc; MOLENBERGHS, Geert |
Issue Date: | 2004 | Source: | ASA Proceedings of the Joint Statistical Meetings. p. 190-.... | Abstract: | In a regression setting, the Akaike Information Criterion (AIC) is one of the most frequently used methods for selecting an optimal regression model. When the missingness probability of the partially missing covariates depends on the response, regression estimates based on the complete cases are known to be biased. Applying selection criteria in the presence of missing covariate data can therefore lead to the selection of poor or incorrect models. We introduce a weighted version of AIC, in analogy with the weighted Horvitz-Thompson estimates. The weights are thus proportional to the inverse of the missingness probabilities. In some settings these probabilities are known; in other settings we propose to use semiparametric estimates. In addition to theoretical properties, an extensive simulation study and several data examples show that the weighted AIC criterion provides better model choices. This modification can be seen as an implicit imputation of missing observations. The extension towards the Bayesian Information Criterion, Mallows' Cp and other criteria is straightforward. | Keywords: | Akaike information criterion; missing data; model selection; weights | Document URI: | http://hdl.handle.net/1942/4342 | Category: | C2 | Type: | Proceedings Paper |
Appears in Collections: | Research publications |
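The weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_aic`, the Gaussian linear model, and the unchanged penalty term `2k` are all assumptions made here for illustration, and the paper's exact definition may differ. Each complete case is weighted by the inverse of its probability of being fully observed, and the weighted log-likelihood replaces the usual one in the AIC formula.

```python
import numpy as np


def weighted_aic(X, y, observed, pi):
    """Illustrative inverse-probability-weighted AIC for a Gaussian linear model.

    X        : (n, p) design matrix; rows of incomplete cases may contain NaN
    y        : (n,) response vector
    observed : (n,) indicator, 1 if all covariates in the row are observed
    pi       : (n,) probability that the row is fully observed (known or estimated)

    Assumption (not taken from the paper): the penalty is the usual 2k,
    with k = p regression coefficients + 1 variance parameter.
    """
    observed = np.asarray(observed, dtype=bool)
    # Only complete cases contribute; each is weighted by 1 / pi_i.
    Xc = np.asarray(X, dtype=float)[observed]
    yc = np.asarray(y, dtype=float)[observed]
    wc = 1.0 / np.asarray(pi, dtype=float)[observed]

    # Weighted least squares via row scaling by sqrt(w).
    sw = np.sqrt(wc)
    beta, *_ = np.linalg.lstsq(Xc * sw[:, None], yc * sw, rcond=None)

    # Weighted maximum-likelihood estimate of the error variance.
    resid = yc - Xc @ beta
    sigma2 = np.sum(wc * resid**2) / np.sum(wc)

    # Weighted Gaussian log-likelihood evaluated at the weighted estimates.
    loglik = -0.5 * np.sum(wc * (np.log(2.0 * np.pi * sigma2) + resid**2 / sigma2))

    k = Xc.shape[1] + 1
    return -2.0 * loglik + 2.0 * k
```

Under these assumptions, candidate models would be compared by calling `weighted_aic` once per candidate design matrix and preferring the smallest value. When every case is observed and all probabilities equal 1, the expression reduces to the ordinary AIC of the Gaussian linear model.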