Data Representativeness: Issues and Solutions

MILANZI, Elasma; NJAGI, Edmund; BRUCKERS, Liesbeth; MOLENBERGHS, Geert

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/18626

Title:	Data Representativeness: Issues and Solutions
Authors:	MILANZI, Elasma; NJAGI, Edmund; BRUCKERS, Liesbeth; MOLENBERGHS, Geert
Issue Date:	2015
Source:	EFSA Supporting Publications 12(2), (ART N° 759E)
Series/Report:	EFSA supporting publication
Abstract:	In its control programmes on maximum residue level compliance and exposure assessment, EFSA requires participating countries to submit results from specific numbers of food item samples analyzed in each country. These data are used to obtain estimates such as the proportion of samples exceeding the maximum residue limits, and the mean and maximum residue concentration per food item, in order to assess exposure. An important consideration is the design and analysis of the programmes. In this report, we combine elements of survey sampling methodology and statistical modeling into a benchmark framework for the programmes, running from the translation of research questions into statistical problems to the statistical analysis and interpretation. Particular focus is placed on issues that could affect the representativeness of the data, and remedial procedures are proposed. For example, in the absence of information on the sampling design, a sensitivity analysis across a range of designs is proposed. To address selection bias, weighted generalized linear mixed models, and generalized linear mixed models combining both conjugate and normal random effects, are proposed. Likelihood-based analysis methods are also proposed to address missing and censored data problems. Suggestions for improvements in the design and analysis of the programmes are also identified and discussed. For instance, incorporating stratified sampling methodology in determining both the total number of samples and their allocation to the participating countries is proposed. Throughout the report, the proposed statistical analysis models properly take into account the hierarchical (and thus correlated) structure in which the data are collected.
Keywords:	bias; censoring; clustering; likelihood; missing data; stratification; linear mixed models; generalized linear mixed models
Document URI:	http://hdl.handle.net/1942/18626
Link to publication/dataset:	http://www.efsa.europa.eu/fr/supporting/doc/759e.pdf
ISSN:	2397-8325
DOI:	10.2903/sp.efsa.2015.EN-759
Rights:	© Interuniversity Institute for Biostatistics and statistical Bioinformatics, 2015.
Category:	A2
Type:	Journal Contribution
Appears in Collections:	Research publications

Files in This Item:

File	Description	Size	Format
759e.pdf	Published version	3.28 MB	Adobe PDF
