Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/45325
Title: Semiparametric and adaptive statistical methods for microbiome data analysis: towards increased reproducibility
Authors: KODALCI, Leyla
Advisors: Thas, Olivier
Issue Date: 2024
Abstract: Differential abundance (DA) analysis is an essential tool in microbiome studies: it identifies microbial taxa that are linked to particular conditions or diseases. However, DA analysis faces significant reproducibility challenges because of inherent features of amplicon-sequenced microbiome data, such as compositionality, sparsity, overdispersion and high dimensionality. Taking these challenges into account is critical for ensuring reliable and unbiased findings. This dissertation presents two novel methods for DA analysis, each developed with its own perspective on the challenges arising from the complex nature of microbiome data. The first is a semiparametric DA method that combines simple sign transformations with established statistical models to test for differential abundance. Its major advantage is that the sign methods inherit the flexibility of these statistical models, so they can adjust for covariates and confounders without relying on strong distributional assumptions. Chapter 2 shows that this approach controls the false discovery rate (FDR) at a fixed nominal level while maintaining competitive sensitivity, and that it is robust under several conditions. The second method, ADAM (Adaptive Differential Abundance Method), selects the most appropriate DA method from a pre-defined set of DA methods in a data-driven way. By adapting to the unique combination of characteristics of the data at hand, ADAM can contribute to more reproducible DA analysis. Chapter 3 demonstrates that this approach controls the FDR at a fixed nominal level while maintaining competitive sensitivity across a range of scenarios.
Following the development of these DA methods, we turn to the evaluation and benchmarking of DA methods. The diversity among available DA methods, partly driven by the complex nature of microbiome data, leads to considerable heterogeneity in how they are evaluated and benchmarked, which contributes to the reproducibility crisis in microbiome research. In Chapter 4, we present 'Neutralise', an open-science, community-driven initiative for neutral comparisons of two-sample tests. By using the very simple two-sample problem, this chapter provides a proof of concept of such an initiative, focusing on the framework's design and architecture while avoiding the added complexities associated with microbiome data. Building on this, Chapter 5 issues a call to develop a comprehensive open-science initiative for neutral comparisons of DA methods in microbiome research, and addresses the specific challenges that arise when extending Neutralise to microbiome data and to the benchmarking of DA methods. By developing statistical methodology and establishing robust benchmarking practices, this research contributes to more reproducible data analysis in microbiome studies.
Document URI: http://hdl.handle.net/1942/45325
Category: T1
Type: Theses and Dissertations
Appears in Collections: Research publications
Files in This Item:
File | Description | Size | Format
---|---|---|---
PhD_Thesis_FINAL_LK.pdf (restricted until 2029-11-30) | Published version | 20.47 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.