Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/16968
Full metadata record
DC Field: Value
dc.contributor.author: De Beuf, Kristof
dc.contributor.author: De Schrijver, Joachim
dc.contributor.author: THAS, Olivier
dc.contributor.author: Van Criekinge, Wim
dc.contributor.author: Irizarry, Rafael A.
dc.contributor.author: CLEMENT, Lieven
dc.date.accessioned: 2014-07-18T10:53:02Z
dc.date.available: 2014-07-18T10:53:02Z
dc.date.issued: 2012
dc.identifier.citation: BMC BIOINFORMATICS, 13
dc.identifier.issn: 1471-2105
dc.identifier.uri: http://hdl.handle.net/1942/16968
dc.description.abstract: Background: 454 pyrosequencing is a commonly used massively parallel DNA sequencing technology with a wide variety of application fields such as epigenetics, metagenomics and transcriptomics. A well-known problem of this platform is its sensitivity to base-calling insertion and deletion errors, particularly in the presence of long homopolymers. In addition, the base-call quality scores are not informative with respect to whether an insertion or a deletion error is more likely. Surprisingly, not much effort has been devoted to the development of improved base-calling methods and more intuitive quality scores for this platform. Results: We present HPCall, a 454 base-calling method based on a weighted Hurdle Poisson model. HPCall uses a probabilistic framework to call the homopolymer lengths in the sequence by modeling well-known 454 noise predictors. Base-calling quality is assessed based on estimated probabilities for each homopolymer length, which are easily transformed to useful quality scores. Conclusions: Using a reference data set of the Escherichia coli K-12 strain, we show that HPCall produces superior quality scores that are very informative towards possible insertion and deletion errors, while maintaining a base-calling accuracy that is better than the current one. Given the generality of the framework, HPCall has the potential to also adapt to other homopolymer-sensitive sequencing technologies.
dc.description.sponsorship: IAP research network of the Belgian government (Belgian Science Policy) (grant number P6/03); Ghent University (Multidisciplinary Research Partnership "Bioinformatics: from nucleotides to networks")
dc.language.iso: en
dc.publisher: BIOMED CENTRAL LTD
dc.rights: © 2012 De Beuf et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
dc.subject.other: biochemical research Methods; biotechnology & applied microbiology; mathematical & computational biology
dc.title: Improved base-calling and quality scores for 454 sequencing based on a Hurdle Poisson model
dc.type: Journal Contribution
dc.identifier.volume: 13
local.format.pages: 11
local.bibliographicCitation.jcat: A1
dc.description.notes: [De Beuf, Kristof; De Schrijver, Joachim; Thas, Olivier; Van Criekinge, Wim] Univ Ghent, Dept Math Modelling Stat & Bioinformat, B-9000 Ghent, Belgium. [Thas, Olivier] Univ Wollongong, Sch Math & Appl Stat, Ctr Stat & Survey Methodol, Wollongong, NSW 2522, Australia. [Irizarry, Rafael A.] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Biostat, Baltimore, MD USA. [Clement, Lieven] Univ Ghent, Dept Appl Math & Comp Sci, B-9000 Ghent, Belgium. [Clement, Lieven] Katholieke Univ Leuven, Interuniv Inst Biostat & Stat Bioinformat, B-3000 Louvain, Belgium. [Clement, Lieven] Univ Hasselt, B-3000 Louvain, Belgium.
local.publisher.place: LONDON
local.type.refereed: Refereed
local.type.specified: Article
dc.identifier.doi: 10.1186/1471-2105-13-303
dc.identifier.isi: 000312894900001
item.accessRights: Open Access
item.fulltext: With Fulltext
item.validation: ecoom 2014
item.contributor: De Beuf, Kristof
item.contributor: De Schrijver, Joachim
item.contributor: THAS, Olivier
item.contributor: Van Criekinge, Wim
item.contributor: Irizarry, Rafael A.
item.contributor: CLEMENT, Lieven
item.fullcitation: De Beuf, Kristof; De Schrijver, Joachim; THAS, Olivier; Van Criekinge, Wim; Irizarry, Rafael A. & CLEMENT, Lieven (2012) Improved base-calling and quality scores for 454 sequencing based on a Hurdle Poisson model. In: BMC BIOINFORMATICS, 13.
crisitem.journal.issn: 1471-2105
crisitem.journal.eissn: 1471-2105
Appears in Collections: Research publications
Files in This Item:
File: 1471-2105-13-303.pdf; Size: 487.58 kB; Format: Adobe PDF

Scopus citations: 12 (checked on Sep 2, 2020)
Web of Science citations: 15 (checked on Mar 29, 2024)
Page views: 88 (checked on Apr 26, 2023)
Downloads: 130 (checked on Apr 26, 2023)
