Enriched-data problems and essential non-identifiability

MOLENBERGHS, Geert; NJAGI, Edmund; Kenward, Michael G.; VERBEKE, Geert

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/14676

Title:	Enriched-data problems and essential non-identifiability
Authors:	MOLENBERGHS, Geert; NJAGI, Edmund; Kenward, Michael G.; VERBEKE, Geert
Issue Date:	2012
Source:	International Journal of Statistics in Medical Research, 1 (1), p. 16-44
Abstract:	There are two principal ways in which statistical models extend beyond the data available. First, the data may be coarsened, that is, what is actually observed is less detailed than what is planned, owing to, for example, attrition, censoring, grouping, or a combination of these. Second, the data may be augmented, that is, the observed data are hypothetically but conveniently supplemented with structures such as random effects, latent variables, latent classes, or component membership in mixture distributions. These two settings together will be referred to as enriched data. Reasons for modelling enriched data include the incorporation of substantive information, such as the need for predictions, advantages in interpretation, and mathematical and computational convenience. The fitting of models for enriched data combine evidence arising from empirical data with non-verifiable model components, i.e., that are purely assumption driven. This has important implications for the interpretation of statistical analyses in such settings. While widely known, the exploration and discussion of these issues is somewhat scattered. Here, we provide a unified framework for enriched data and show in general that to any given model an entire class of models can be assigned, with all of its members producing the same fit to the observed data but arbitrary regarding the unobservable parts of the enriched data. The implications of this are explored for several specific settings, namely that of latent classes, finite mixtures, factor analysis, random-effects models, and incomplete data. The results are applied to a range of relevant examples.
Keywords:	compound-symmetry; empirical Bayes; enriched data; exponential random effects; gamma random effects; linear mixed model; missing at random; missing completely at random; non-future dependence; pattern-mixture model; selection model; shared-parameter model
Document URI:	http://hdl.handle.net/1942/14676
ISSN:	1929-6029
DOI:	10.6000/1929-6029.2012.01.01.02
Rights:	© 2012 Lifescience Global
Category:	A2
Type:	Journal Contribution
Appears in Collections:	Research publications

Files in This Item:

File	Description	Size	Format
enrichment16.pdf	Peer-reviewed author version	582.07 kB	Adobe PDF
IJSMRV1N1A02-Molenberghs.pdf	Published version	4.51 MB	Adobe PDF
