Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/16968
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | De Beuf, Kristof | - |
dc.contributor.author | De Schrijver, Joachim | - |
dc.contributor.author | THAS, Olivier | - |
dc.contributor.author | Van Criekinge, Wim | - |
dc.contributor.author | Irizarry, Rafael A. | - |
dc.contributor.author | CLEMENT, Lieven | - |
dc.date.accessioned | 2014-07-18T10:53:02Z | - |
dc.date.available | 2014-07-18T10:53:02Z | - |
dc.date.issued | 2012 | - |
dc.identifier.citation | BMC BIOINFORMATICS, 13 | - |
dc.identifier.issn | 1471-2105 | - |
dc.identifier.uri | http://hdl.handle.net/1942/16968 | - |
dc.description.abstract | Background: 454 pyrosequencing is a commonly used massively parallel DNA sequencing technology with a wide variety of application fields such as epigenetics, metagenomics and transcriptomics. A well-known problem of this platform is its sensitivity to base-calling insertion and deletion errors, particularly in the presence of long homopolymers. In addition, the base-call quality scores are not informative with respect to whether an insertion or a deletion error is more likely. Surprisingly, not much effort has been devoted to the development of improved base-calling methods and more intuitive quality scores for this platform. Results: We present HPCall, a 454 base-calling method based on a weighted Hurdle Poisson model. HPCall uses a probabilistic framework to call the homopolymer lengths in the sequence by modeling well-known 454 noise predictors. Base-calling quality is assessed based on estimated probabilities for each homopolymer length, which are easily transformed to useful quality scores. Conclusions: Using a reference data set of the Escherichia coli K-12 strain, we show that HPCall produces superior quality scores that are very informative towards possible insertion and deletion errors, while maintaining a base-calling accuracy that is better than the current one. Given the generality of the framework, HPCall has the potential to also adapt to other homopolymer-sensitive sequencing technologies. | - |
dc.description.sponsorship | IAP research network of the Belgian government (Belgian Science Policy) (grant number P6/03); Ghent University (Multidisciplinary Research Partnership "Bioinformatics: from nucleotides to networks") | - |
dc.language.iso | en | - |
dc.publisher | BIOMED CENTRAL LTD | - |
dc.rights | © 2012 De Beuf et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. | - |
dc.subject.other | biochemical research methods; biotechnology & applied microbiology; mathematical & computational biology | - |
dc.title | Improved base-calling and quality scores for 454 sequencing based on a Hurdle Poisson model | - |
dc.type | Journal Contribution | - |
dc.identifier.volume | 13 | - |
local.format.pages | 11 | - |
local.bibliographicCitation.jcat | A1 | - |
dc.description.notes | [De Beuf, Kristof; De Schrijver, Joachim; Thas, Olivier; Van Criekinge, Wim] Univ Ghent, Dept Math Modelling Stat & Bioinformat, B-9000 Ghent, Belgium. [Thas, Olivier] Univ Wollongong, Sch Math & Appl Stat, Ctr Stat & Survey Methodol, Wollongong, NSW 2522, Australia. [Irizarry, Rafael A.] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Biostat, Baltimore, MD USA. [Clement, Lieven] Univ Ghent, Dept Appl Math & Comp Sci, B-9000 Ghent, Belgium. [Clement, Lieven] Katholieke Univ Leuven, Interuniv Inst Biostat & Stat Bioinformat, B-3000 Louvain, Belgium. [Clement, Lieven] Univ Hasselt, B-3000 Louvain, Belgium. | - |
local.publisher.place | LONDON | - |
local.type.refereed | Refereed | - |
local.type.specified | Article | - |
dc.identifier.doi | 10.1186/1471-2105-13-303 | - |
dc.identifier.isi | 000312894900001 | - |
item.fulltext | With Fulltext | - |
item.contributor | De Beuf, Kristof | - |
item.contributor | De Schrijver, Joachim | - |
item.contributor | THAS, Olivier | - |
item.contributor | Van Criekinge, Wim | - |
item.contributor | Irizarry, Rafael A. | - |
item.contributor | CLEMENT, Lieven | - |
item.fullcitation | De Beuf, Kristof; De Schrijver, Joachim; THAS, Olivier; Van Criekinge, Wim; Irizarry, Rafael A. & CLEMENT, Lieven (2012) Improved base-calling and quality scores for 454 sequencing based on a Hurdle Poisson model. In: BMC BIOINFORMATICS, 13. | - |
item.accessRights | Closed Access | - |
item.validation | ecoom 2014 | - |
crisitem.journal.issn | 1471-2105 | - |
crisitem.journal.eissn | 1471-2105 | - |
Appears in Collections: | Research publications |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
1471-2105-13-303.pdf | | 487.58 kB | Adobe PDF | View/Open |