Dealing with missing data in cross sectional data on transport

RUMISHA, Susan Fred

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/3695

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	HENS, N.	-
dc.contributor.author	RUMISHA, Susan Fred	-
dc.date.accessioned	2007-11-29T12:35:14Z	-
dc.date.available	2007-11-29T12:35:14Z	-
dc.date.issued	2007	-
dc.identifier.uri	http://hdl.handle.net/1942/3695	-
dc.description.abstract	In sample surveys and most research work, non-response is often a major problem: the required data are sometimes not obtained for all elements selected for observation, which leads to missing data. Missingness can occur in cross-sectional, longitudinal or multivariate studies. Different imputation methods are available and have been used to fill in the missing data (either response or covariates), and the resulting data are expected, under certain conditions, to lead to valid inference. This study explores the efficiency of several imputation methods, both parametric and nonparametric, for cross-sectional data in estimating the effect of covariates in linear models. Simple and advanced imputation methods, such as multiple imputation, were considered. Since our data came from a cross-sectional study, univariate patterns and behaviors of missingness were used. Two main scenarios were considered: missingness in the response variable and missingness in the covariate. The approach was to generate a new data set, induce missingness using different missingness models depending on the assumed mechanism, and then impute the missing values. Accuracy was assessed by comparing the results with the true estimates, which were obtained from the original generated data. The focus was on the regression model parameter estimates (with their SE) and the variability introduced in the response values. To evaluate the efficiency of the methods and the variability of the parameters of interest, simulation studies were done. From the runs obtained, MASE values were calculated for each method and compared. Parametric imputation methods were found to be inadequate, especially when the proportion of missing responses is high. Results from nonparametric methods were good despite slight over- or underestimation of the variability in the data.
For the case of missingness in the covariate, unbiased results were obtained under MCAR and MAR and biased results under MNAR. However, in this case, single parametric methods seem to perform better than multiple imputation methods or nonparametric ones. It was observed that the missingness mechanism could be influenced by the magnitude of the covariate effect in the fitted model or in the missingness model involved. In other words, the strength of the relationship between the covariates and the response variable plays a role in manipulating the missingness mechanism. These results were observed using simple exploration; hence, more research is needed to provide more support.	-
dc.language.iso	en	-
dc.title	Dealing with missing data in cross sectional data on transport	-
dc.type	Theses and Dissertations	-
local.format.pages	87	-
local.bibliographicCitation.jcat	T2	-
dc.description.notes	Master in Biostatistics	-
local.type.specified	Master thesis	-
dc.bibliographicCitation.oldjcat		-
item.contributor	RUMISHA, Susan Fred	-
item.fulltext	With Fulltext	-
item.accessRights	Open Access	-
item.fullcitation	RUMISHA, Susan Fred (2007) Dealing with missing data in cross sectional data on transport.	-
Appears in Collections:	Applied Statistics: Master theses
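
The simulation approach outlined in the abstract (generate data, induce missingness in the response under an assumed mechanism, impute, then compare with the full-data estimates) can be illustrated with a minimal sketch. This is not the thesis's code: the linear model, the sample size, the logistic missingness models, the function names, and the choice of simple mean imputation are all assumptions made here for illustration only.

```python
import numpy as np

def simulate(n=500, beta0=1.0, beta1=2.0, seed=0):
    """Generate a complete data set from an assumed linear model."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = beta0 + beta1 * x + rng.normal(size=n)
    return x, y

def missing_mask(x, y, mechanism, rng):
    """Return a boolean mask, True where the response is made missing.

    The missingness probabilities below are illustrative assumptions.
    """
    if mechanism == "MCAR":      # independent of x and y
        p = np.full(x.shape, 0.3)
    elif mechanism == "MAR":     # depends on the observed covariate
        p = 1.0 / (1.0 + np.exp(-(-1.0 + x)))
    elif mechanism == "MNAR":    # depends on the unobserved response
        p = 1.0 / (1.0 + np.exp(-(-1.0 + (y - y.mean()))))
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    return rng.random(x.shape) < p

def ols(x, y):
    """Least-squares fit of y on x; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def mean_impute(x, y, mask):
    """Replace missing responses with the mean of the observed ones."""
    y_imp = y.copy()
    y_imp[mask] = y[~mask].mean()
    return y_imp

x, y = simulate()
rng = np.random.default_rng(1)
print("full data slope:", ols(x, y)[1])
for mechanism in ("MCAR", "MAR", "MNAR"):
    mask = missing_mask(x, y, mechanism, rng)
    slope = ols(x, mean_impute(x, y, mask))[1]
    print(mechanism, "missing fraction:", mask.mean(), "slope:", slope)
```

Repeating this over many generated data sets, and swapping `mean_impute` for other imputation methods, would give the kind of per-method comparison against the full-data estimates that the abstract describes.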

Files in This Item:

File	Description	Size	Format
rumisha.pdf		4.3 MB	Adobe PDF	View/Open
