Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/46444
Full metadata record
DC Field | Value | Language
dc.contributor.author | CORREA ROJO, Alejandro | -
dc.contributor.author | Moris, Pieter | -
dc.contributor.author | MEUWISSEN, Hanne | -
dc.contributor.author | Monsieurs, Pieter | -
dc.contributor.author | VALKENBORG, Dirk | -
dc.contributor.editor | Lengauer, Thomas | -
dc.date.accessioned | 2025-07-28T07:31:25Z | -
dc.date.available | 2025-07-28T07:31:25Z | -
dc.date.issued | 2025 | -
dc.date.submitted | 2025-07-22T11:24:10Z | -
dc.identifier.citation | Bioinformatics Advances, 5 (1) (Art N° vbaf143) | -
dc.identifier.uri | http://hdl.handle.net/1942/46444 | -
dc.description.abstract | The Discriminant Analysis of Principal Components method is a pivotal tool in population genetics, combining principal component analysis and linear discriminant analysis to assess the genetic structure of populations using genetic markers, focusing on the description of variation between genetic clusters. Despite its utility, the original R implementation in the adegenet package faces computational challenges with large genomic datasets. To address these limitations, we introduce DAPCy, a Python package leveraging the scikit-learn library to enhance the method's scalability and efficiency. DAPCy supports large datasets by utilizing compressed sparse matrices and truncated singular value decomposition for dimensionality reduction, coupled with training-test cross-validation for robust model evaluation. It also includes modules for de novo genetic clustering and extensive visualization and reporting capabilities. Compared to the original R implementation, DAPCy can process genomic datasets with thousands of samples and features in less computational time and with reduced memory usage. To show DAPCy's computational capabilities, we benchmarked it with the R implementation using the Plasmodium falciparum dataset from MalariaGEN and the 1000 Genomes Project. Availability and implementation: DAPCy can be installed as a Python package through pip. Source code is available on https://gitlab.com/uhasselt-bioinfo/dapcy. Documentation and a tutorial can be found on https://uhasselt-bioinfo.gitlab.io/dapcy/. | -
dc.description.sponsorship | This work was supported by the Flemish Special Research Fund (BOF) [BOF21DOC23]. | -
dc.language.iso | en | -
dc.publisher | OXFORD UNIV PRESS | -
dc.rights | The Author(s) 2025. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. | -
dc.title | DAPCy: a Python package for the discriminant analysis of principal components method for population genetic analyses | -
dc.type | Journal Contribution | -
dc.identifier.issue | 1 | -
dc.identifier.volume | 5 | -
local.format.pages | 6 | -
local.bibliographicCitation.jcat | A1 | -
dc.description.notes | Rojo, AC (corresponding author), Hasselt Univ, Data Sci Inst, Interuniv Inst Biostat & Stat Bioinformat I BioSta, Agoralaan Gebouw D, B-3500 Diepenbeek, Belgium.; Rojo, AC (corresponding author), Flemish Inst Technol Res VITO, Boeretang 200, B-2400 Mol, Belgium. | -
dc.description.notes | alejandro.correarojo@vito.be | -
local.publisher.place | GREAT CLARENDON ST, OXFORD OX2 6DP, ENGLAND | -
local.type.refereed | Refereed | -
local.type.specified | Article | -
local.bibliographicCitation.artnr | vbaf143 | -
dc.identifier.doi | 10.1093/bioadv/vbaf143 | -
dc.identifier.pmid | 40630499 | -
dc.identifier.isi | 001524065700001 | -
local.provider.type | wosris | -
local.description.affiliation | [Correa Rojo, Alejandro; Meuwissen, Hanne] Hasselt Univ, Data Sci Inst, Interuniv Inst Biostat & Stat Bioinformat I BioSta, Agoralaan Gebouw D, B-3500 Diepenbeek, Belgium. | -
local.description.affiliation | [Correa Rojo, Alejandro; Valkenborg, Dirk] Flemish Inst Technol Res VITO, Boeretang 200, B-2400 Mol, Belgium. | -
local.description.affiliation | [Moris, Pieter; Monsieurs, Pieter] Inst Trop Med, Dept Biomed Sci, Unit Malariol, B-2000 Antwerp, Belgium. | -
local.uhasselt.international | no | -
item.fullcitation | CORREA ROJO, Alejandro; Moris, Pieter; MEUWISSEN, Hanne; Monsieurs, Pieter & VALKENBORG, Dirk (2025) DAPCy: a Python package for the discriminant analysis of principal components method for population genetic analyses. In: Bioinformatics Advances, 5 (1) (Art N° vbaf143). | -
item.contributor | CORREA ROJO, Alejandro | -
item.contributor | Moris, Pieter | -
item.contributor | MEUWISSEN, Hanne | -
item.contributor | Monsieurs, Pieter | -
item.contributor | VALKENBORG, Dirk | -
item.contributor | Lengauer, Thomas | -
item.fulltext | With Fulltext | -
item.accessRights | Open Access | -
crisitem.journal.eissn | 2635-0041 | -
Appears in Collections:Research publications
Files in This Item:
File | Description | Size | Format
DAPCy_ a Python package for the discriminant analysis of principal components method for population genetic analyses.pdf | Published version | 855.94 kB | Adobe PDF
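
The abstract describes the method as dimensionality reduction by truncated singular value decomposition on a compressed sparse genotype matrix, followed by linear discriminant analysis and evaluated on a training-test split. The sketch below illustrates that general idea using plain scikit-learn and SciPy calls only. It is not DAPCy's API, which is not shown in this record; the synthetic genotype data, the variable names, the number of components, and the 70/30 split are all assumptions made for illustration.

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import TruncatedSVD
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Hypothetical data: 3 populations x 100 samples, 500 biallelic markers,
# coded as alternate-allele counts (0, 1 or 2) per sample.
n_pops, n_per_pop, n_markers = 3, 100, 500
freqs = rng.uniform(0.05, 0.5, size=(n_pops, n_markers))
genotypes = np.vstack(
    [rng.binomial(2, freqs[k], size=(n_per_pop, n_markers)) for k in range(n_pops)]
)
labels = np.repeat(np.arange(n_pops), n_per_pop)

# Store the genotype matrix in compressed sparse row (CSR) format.
X = sparse.csr_matrix(genotypes.astype(np.float32))

# Hold out 30% of the samples for evaluation, stratified by population.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.3, stratify=labels, random_state=0
)

# Truncated SVD accepts sparse input directly (it does not center the data,
# so it approximates PCA rather than reproducing it exactly); LDA is then
# fitted on the reduced components.
dapc_like = make_pipeline(
    TruncatedSVD(n_components=20, random_state=0),
    LinearDiscriminantAnalysis(),
)
dapc_like.fit(X_train, y_train)

accuracy = dapc_like.score(X_test, y_test)   # held-out classification accuracy
coords = dapc_like.transform(X_test)         # samples on the discriminant axes
print(f"test accuracy: {accuracy:.3f}, discriminant coords: {coords.shape}")
```

With three populations, LDA yields at most two discriminant axes, so `coords` has two columns that could be used for a scatter plot of the held-out samples. For the package's actual interface, see the documentation linked in the abstract.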