Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/17878
Full metadata record
DC Field | Value | Language
dc.contributor.author | KASSAHUN, Wondwosen | -
dc.contributor.author | NEYENS, Thomas | -
dc.contributor.author | MOLENBERGHS, Geert | -
dc.contributor.author | FAES, Christel | -
dc.contributor.author | VERBEKE, Geert | -
dc.date.accessioned | 2014-11-25T10:30:32Z | -
dc.date.available | 2014-11-25T10:30:32Z | -
dc.date.issued | 2014 | -
dc.identifier.citation | STATISTICS IN MEDICINE, 33 (25), p. 4402-4419 | -
dc.identifier.issn | 0277-6715 | -
dc.identifier.uri | http://hdl.handle.net/1942/17878 | -
dc.description.abstract | Count data are collected repeatedly over time in many applications, such as biology, epidemiology, and public health. Such data are often characterized by the following three features. First, correlation due to the repeated measures is usually accounted for using subject-specific random effects, which are assumed to be normally distributed. Second, the sample variance may exceed the mean, and hence, the theoretical mean-variance relationship is violated, leading to overdispersion. This is usually allowed for based on a hierarchical approach, combining a Poisson model with gamma distributed random effects. Third, an excess of zeros beyond what standard count distributions can predict is often handled by either the hurdle or the zero-inflated model. A zero-inflated model assumes two processes as sources of zeros and combines a count distribution with a discrete point mass as a mixture, while the hurdle model separately handles zero observations and positive counts, where then a truncated-at-zero count distribution is used for the non-zero state. In practice, however, all these three features can appear simultaneously. Hence, a modeling framework that incorporates all three is necessary, and this presents challenges for the data analysis. Such models, when conditionally specified, will naturally have a subject-specific interpretation. However, adopting their purposefully modified marginalized versions leads to a direct marginal or population-averaged interpretation for parameter estimates of covariate effects, which is the primary interest in many applications. In this paper, we present a marginalized hurdle model and a marginalized zero-inflated model for correlated and overdispersed count data with excess zero observations and then illustrate these further with two case studies. The first dataset focuses on the Anopheles mosquito density around a hydroelectric dam, while adolescents' involvement in work, to earn money and support their families or themselves, is studied in the second example. Sub-models, which result from omitting zero-inflation and/or overdispersion features, are also considered for comparison's purpose. Analysis of the two datasets showed that accounting for the correlation, overdispersion, and excess zeros simultaneously resulted in a better fit to the data and, more importantly, that omission of any of them leads to incorrect marginal inference and erroneous conclusions about covariate effects. Copyright (c) 2014 John Wiley & Sons, Ltd. | -
dc.description.sponsorship | WHO/TDR [A50881]; Institutional University Cooperation of the Council of Flemish Universities (VLIR-IUC); IAP research Network P7/06 of the Belgian Government (Belgian Science Policy) | -
dc.language.iso | en | -
dc.publisher | WILEY-BLACKWELL | -
dc.rights | Copyright © 2014 John Wiley & Sons, Ltd. | -
dc.subject.other | clustering; hurdle model; marginal model; multilevel model; overdispersion; zero-inflated model | -
dc.title | Marginalized multilevel hurdle and zero-inflated models for overdispersed and correlated count data with excess zeros | -
dc.type | Journal Contribution | -
dc.identifier.epage | 4419 | -
dc.identifier.issue | 25 | -
dc.identifier.spage | 4402 | -
dc.identifier.volume | 33 | -
local.format.pages | 18 | -
local.bibliographicCitation.jcat | A1 | -
dc.description.notes | [Kassahun, Wondwosen] Jimma Univ, Dept Epidemiol & Biostat, Jimma, Ethiopia. [Neyens, Thomas; Molenberghs, Geert; Faes, Christel; Verbeke, Geert] Univ Hasselt, B-3590 Diepenbeek, Belgium. [Molenberghs, Geert; Verbeke, Geert] Katholieke Univ Leuven, B-3000 Leuven, Belgium. | -
local.publisher.place | HOBOKEN | -
local.type.refereed | Refereed | -
local.type.specified | Article | -
dc.identifier.doi | 10.1002/sim.6237 | -
dc.identifier.isi | 000342898500006 | -
item.accessRights | Restricted Access | -
item.validation | ecoom 2015 | -
item.fulltext | With Fulltext | -
item.fullcitation | KASSAHUN, Wondwosen; NEYENS, Thomas; MOLENBERGHS, Geert; FAES, Christel & VERBEKE, Geert (2014) Marginalized multilevel hurdle and zero-inflated models for overdispersed and correlated count data with excess zeros. In: STATISTICS IN MEDICINE, 33 (25), p. 4402-4419. | -
item.contributor | KASSAHUN, Wondwosen | -
item.contributor | NEYENS, Thomas | -
item.contributor | MOLENBERGHS, Geert | -
item.contributor | FAES, Christel | -
item.contributor | VERBEKE, Geert | -
crisitem.journal.issn | 0277-6715 | -
crisitem.journal.eissn | 1097-0258 | -
Appears in Collections: Research publications
Files in This Item:
File | Description | Size | Format
kassahun 1.pdf (Restricted Access) | Published version | 204.48 kB | Adobe PDF
Scopus citations: 21 (checked on Sep 3, 2020)
Web of Science citations: 27 (checked on Apr 22, 2024)
Page view(s): 62 (checked on Sep 7, 2022)
Download(s): 46 (checked on Sep 7, 2022)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.