metaGEENOME: an integrated framework for differential abundance analysis of microbiome data in cross-sectional and longitudinal studies

Abdelkader, Ahmed; Ferdous, Nur A.; El-Hadidi, Mohamed; BURZYKOWSKI, Tomasz; Mysara, Mohamed

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/46502

Title:	metaGEENOME: an integrated framework for differential abundance analysis of microbiome data in cross-sectional and longitudinal studies
Authors:	Abdelkader, Ahmed; Ferdous, Nur A.; El-Hadidi, Mohamed; BURZYKOWSKI, Tomasz; Mysara, Mohamed
Issue Date:	2025
Publisher:	BMC
Source:	BMC bioinformatics, 26 (1) (Art N° 189)
Abstract:	Background: Detecting biomarkers is a key objective in microbiome research, often done through 16S rRNA amplicon sequencing or shotgun metagenomic analysis. A critical step in this process is differential abundance (DA) analysis, which aims to pinpoint taxa whose abundance differs significantly between groups. However, DA analysis remains challenging due to high dimensionality, compositionality, sparsity, inter-taxa correlations, uneven abundance distributions, and missing values, all of which hinder our ability to model the data accurately. Despite the availability of many DA tools, balancing high statistical power with effective false discovery rate (FDR) control remains a major limitation. Results: Here, we introduce a novel approach for DA analysis that integrates Counts adjusted with Trimmed Mean of M-values (CTF) normalization and Centered Log Ratio (CLR) transformation with a Generalized Estimating Equation (GEE) model. We benchmarked our approach against eight widely used tools, employing both simulated and real datasets in cross-sectional and longitudinal settings. While several tools (e.g. metagenomeSeq, edgeR, DESeq2 and LEfSe) achieved high sensitivity, they often failed to adequately control the FDR. In contrast, our method demonstrated high sensitivity and specificity when compared to other approaches that successfully controlled the FDR, including ALDEx2, limma-voom, ANCOM, and ANCOM-BC2. Conclusions: Our approach effectively addresses key challenges in microbiome data analysis across both cross-sectional and longitudinal designs. Integrated into the R package metaGEENOME (https://github.com/M-Mysara/metaGEENOME), our framework provides a flexible, scalable and statistically robust solution for DA analysis, offering improved FDR control and enhanced performance for biomarker discovery in microbiome studies.
Notes:	Mysara, M (corresponding author), Nile Univ, Ctr Informat Sci CIS, Sch Informat Technol & Comp Sci ITCS, Bioinformat Res Grp, Giza, Egypt. mmaysara@nu.edu.eg
Keywords:	Microbiome;Differential abundance;Biomarkers;Repeated measures;False discovery rate
Document URI:	http://hdl.handle.net/1942/46502
ISSN:	1471-2105
e-ISSN:	1471-2105
DOI:	10.1186/s12859-025-06217-x
ISI #:	001532017400002
Datasets of the publication:	10.6084/m9.figshare.29615307
Rights:	The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Category:	A1
Type:	Journal Contribution
Appears in Collections:	Research publications
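
The abstract names the Centered Log Ratio (CLR) transformation as one step of the approach. As a rough illustration of that step alone, the sketch below applies the standard CLR definition (the log of each count minus the mean of the logs) to a single sample. It is not taken from the metaGEENOME package: the function name `clr_transform`, the pseudocount of 0.5 used to handle zero counts, and the example counts are all assumptions made for illustration, and the CTF normalization and GEE modelling steps are not shown.

```python
import math


def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount is an illustrative assumption to avoid log(0) on
    sparse data; the published method may treat zeros differently.
    """
    # Log of each (shifted) count.
    logs = [math.log(c + pseudocount) for c in counts]
    # The mean of the logs equals the log of the geometric mean.
    mean_log = sum(logs) / len(logs)
    # Subtracting it centres the sample, so the values sum to zero.
    return [value - mean_log for value in logs]


# Hypothetical counts for four taxa in one sample.
sample = [120, 30, 0, 850]
clr_values = clr_transform(sample)
```

Because each value is expressed relative to the sample's geometric mean, the result does not change when all counts are multiplied by a constant (with the pseudocount set to zero), which is why the transform is commonly used for compositional data.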

Files in This Item:

File	Description	Size	Format
s12859-025-06217-x.pdf	Published version	2.13 MB	Adobe PDF
