Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/35299
Title: Assessing performance of heuristic-based biological alignment algorithms
Authors: Ankunda, Violet
Advisors: VALKENBORG, Dirk
VAN HYFTE, Dirk
Issue Date: 2021
Publisher: tUL
Abstract: Sequence alignment is the process of comparing different sequences by searching for matching series of individual characters. Current state-of-the-art biological sequence alignment algorithms such as BLAST rely on heuristics and dynamic programming based on probabilistic models. Moreover, these algorithms perform their analysis within a so-called query window, defined as the region most similar to the query sequence, with the risk of missing homologies outside that window that may still be relevant. The main purpose of this Master's thesis project is to evaluate the performance of BLAST, to retrieve homologous sequences of given queries from a set of well-known protein sequences, and to evaluate how different metrics can be used to assess performance. A further aim is to use this framework to compare BLAST with other heuristic-based biological alignment algorithms. In this project, Protein Data Bank (PDB) data was used as the target sequences. The Structural Classification of Proteins-extended (SCOPe) database, which was used to generate the query set of 100 sequences, classifies proteins based on similarities of their structures and amino acid sequences. Receiver Operating Characteristic (ROC) and precision-recall curves were plotted to compare BLAST results obtained with different parameter settings. To assess overall performance, the area under the curve was calculated for each graph. The results indicated only a marginal difference between the performance of BLAST with default parameters and with modified parameters.
Notes: Master of Statistics and Data Science-Bioinformatics
Document URI: http://hdl.handle.net/1942/35299
Category: T2
Type: Theses and Dissertations
Appears in Collections: Master theses

Files in This Item:
File: c580a16f-f5e9-4ab7-b18a-a1bc2af4830a.pdf
Size: 2.76 MB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.