Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/10615
Title: Informative or Noninformative Calls for Gene Expression: A Latent Variable Approach
Authors: KASIM, Adetayo 
LIN, Dan 
VAN SANDEN, Suzy 
Clevert, Djork-Arne
BIJNENS, Luc 
Goehlmann, Hinrich
AMARATUNGA, Dhammika 
Hochreiter, Sepp
SHKEDY, Ziv 
TALLOEN, Willem 
Issue Date: 2010
Publisher: BERKELEY ELECTRONIC PRESS
Source: STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 9(1)
Abstract: The strength and the weakness of microarray technology both stem from the enormous amount of information it generates. To realize the full benefit of microarray technology for classification and for testing differentially expressed genes, the number of irrelevant genes in microarray data needs to be minimized. A major interest is to use probe-level data to call genes informative or noninformative based on the trade-off between array-to-array variability and measurement error. Existing work in this direction includes filtering likely uninformative sets of hybridization (FLUSH; Calza et al., 2007) and informative/noninformative (I/NI) calls for the exclusion of noninformative genes using FARMS (Talloen et al., 2007; Hochreiter et al., 2006). In this paper, we propose a linear mixed model as a more flexible method that performs as well as I/NI calls and outperforms FLUSH. We also introduce other criteria for gene filtering, such as R² and intra-cluster correlation. Additionally, we include objective criteria based on likelihood ratio testing, the Akaike information criterion (AIC; Akaike, 1973), and the Bayesian information criterion (BIC; Schwarz, 1978). Based on the HGU-133A Spiked-in data set, the linear mixed model approach is shown to outperform FLUSH, a method that filters genes based on quantile regression. The linear mixed model is equivalent to a factor analysis model when either the factor loadings are set to a constant and the variance of the latent factor is fixed at one, or the factor loadings are set to one and the variance of the latent factor is left unconstrained. Filtering based on conditional variance calls a probe set informative when the intensity of one or more probes is consistent across the arrays, while filtering using R² or intra-cluster correlation calls a probe set informative only when the average intensity of the probe set is consistent across the arrays. Filtering based on the likelihood ratio test, AIC, and BIC is less stringent than filtering based on the other criteria.
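Illustration: the intra-cluster correlation criterion named in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a random-intercept model for one probe set (log-intensity = probe effect + random array effect + measurement error) and uses simple ANOVA (method-of-moments) variance estimates in place of a likelihood-based mixed-model fit. The function names, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np


def intra_cluster_correlation(y):
    """Estimate the ICC for one probe set.

    y is an (arrays x probes) matrix of log-intensities. The ICC is the
    share of variance attributed to array-to-array variability, with the
    remainder attributed to measurement error.
    """
    y = np.asarray(y, dtype=float)
    n_arrays, n_probes = y.shape

    # Remove probe-specific mean levels (treated as fixed effects).
    centered = y - y.mean(axis=0, keepdims=True)

    # Between-array mean square from the per-array averages.
    array_means = centered.mean(axis=1)
    ms_between = n_probes * np.sum(array_means ** 2) / (n_arrays - 1)

    # Residual mean square estimates the measurement-error variance.
    resid = centered - array_means[:, None]
    ms_error = np.sum(resid ** 2) / ((n_arrays - 1) * (n_probes - 1))

    # Method-of-moments estimate of the array variance, truncated at zero.
    var_array = max((ms_between - ms_error) / n_probes, 0.0)
    total = var_array + ms_error
    return var_array / total if total > 0 else 0.0


def simulate_probe_set(rng, n_arrays=20, n_probes=11, sd_array=1.0, sd_error=0.3):
    """Simulate a hypothetical probe set under the random-intercept model."""
    probe_effects = rng.normal(8.0, 1.0, size=n_probes)
    array_effects = rng.normal(0.0, sd_array, size=n_arrays)
    noise = rng.normal(0.0, sd_error, size=(n_arrays, n_probes))
    return probe_effects[None, :] + array_effects[:, None] + noise


rng = np.random.default_rng(0)
# Probes share a common array-to-array signal.
icc_informative = intra_cluster_correlation(simulate_probe_set(rng, sd_array=1.0))
# No shared signal: only measurement error varies across arrays.
icc_noninformative = intra_cluster_correlation(simulate_probe_set(rng, sd_array=0.0))

print(f"ICC, shared signal:    {icc_informative:.3f}")
print(f"ICC, measurement only: {icc_noninformative:.3f}")
```

A probe set whose probes move together across arrays yields an ICC near one, while a probe set dominated by measurement error yields an ICC near zero. Any cutoff for calling a probe set informative would have to be taken from the paper itself; none is assumed here.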
Notes: [Kasim, Adetayo; Lin, Dan; Van Sanden, Suzy; Shkedy, Ziv] Univ Hasselt, Diepenbeek, Belgium. [Kasim, Adetayo; Lin, Dan; Van Sanden, Suzy; Shkedy, Ziv] Katholieke Univ Leuven, Louvain, Belgium. [Clevert, Djork-Arne; Hochreiter, Sepp] Johannes Kepler Univ Linz, Linz, Austria. [Clevert, Djork-Arne] Charite Univ Med Berlin, Berlin, Germany. [Bijnens, Luc; Goehlmann, Hinrich; Talloen, Willem] Janssen Pharmaceut NV, Beerse, Belgium. [Amaratunga, Dhammika] Johnson & Johnson Pharmaceut Res & Dev, Raritan, NJ USA.
Keywords: gene filtering; factor analysis; linear mixed model
Document URI: http://hdl.handle.net/1942/10615
ISSN: 2194-6302
e-ISSN: 1544-6115
DOI: 10.2202/1544-6115.1460
ISI #: 000274198200003
Category: A1
Type: Journal Contribution
Validations: ecoom 2011
Appears in Collections: Research publications
