
dc.contributor.author: Soumare, Ibrahim
dc.description.abstract: Disease statuses and biological conditions are known to be greatly impacted by differences in gene expression levels. A common challenge in RNA-seq data analysis is to identify genes whose mean expression levels change across different groups of samples or, more generally, are associated with one or more variables of interest. Such analysis is called differential expression analysis. Many tools have been developed for analyzing differential gene expression (DGE) in RNA-seq data. RNA-seq data are represented as counts. Typically, a generalized linear model with a log link and a negative binomial response is fit to the count data for each gene, and differentially expressed (DE) genes are identified by testing, for each gene, whether a model parameter or a linear combination of model parameters is zero. We conducted a simulation study to compare the performance of our proposed modified permutation test with that of DESeq2, edgeR, Limma, LFC, and Voom when applied to RNA-seq data. We considered different combinations of sample sizes and underlying distributions. We first simulated data using Monte Carlo simulation in SAS and assessed the true detection rate and false positive rate of each method. We then simulated data from real RNA-seq data using the SimSeq algorithm and again compared the performance of our proposed test with that of DESeq2, edgeR, Limma, LFC, and Voom. The simulation results suggest that permutation tests are a competitive alternative to traditional parametric methods for analyzing RNA-seq data when sample sizes are sufficient. Specifically, the permutation test controlled the Type I error rate fairly well and had comparable power. Moreover, for sample sizes n ≥ 10, the simulations showed a comparable true detection rate and a consistently very low false positive rate when sampling from Poisson and negative binomial distributions. Likewise, the SimSeq results confirm that permutation tests keep the false positive rate the lowest. [en_US]
dc.publisher: North Dakota State University [en_US]
dc.rights: NDSU policy 190.6.2 [en_US]
dc.title: Performance of Permutation Tests Using Simulated Genetic Data [en_US]
dc.type: Dissertation [en_US]
dc.date.accessioned: 2023-12-20T18:33:42Z
dc.date.available: 2023-12-20T18:33:42Z
dc.date.issued: 2022
dc.identifier.uri: https://hdl.handle.net/10365/33402
dc.subject: genetic data [en_US]
dc.subject: Monte Carlo Simulation [en_US]
dc.subject: Permutation tests [en_US]
dc.subject: RNA-seq data [en_US]
dc.subject: SimSeq [en_US]
dc.rights.uri: https://www.ndsu.edu/fileadmin/policy/190.pdf [en_US]
ndsu.degree: Doctor of Philosophy (PhD) [en_US]
ndsu.college: Science and Mathematics [en_US]
ndsu.department: Statistics [en_US]
ndsu.program: Statistics [en_US]
ndsu.advisor: Magel, Rhonda
ndsu.advisor: Doetkott, Curt
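
The abstract does not spell out the modified permutation test it proposes, so the following is only a minimal sketch of a generic two-sample permutation test for a difference in mean counts, applied to a single gene. It is not the dissertation's method. The function name `permutation_test`, the choice of test statistic, the number of permutations, and the read counts are all hypothetical and chosen purely for illustration.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Illustrative only: a generic test, not the modified test from the
    dissertation. Returns a p-value with the usual +1 correction, so the
    result is never exactly zero.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign samples to the two groups
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Hypothetical read counts for one gene, 10 samples per group.
control = [12, 15, 9, 11, 14, 10, 13, 12, 16, 11]
treated = [25, 30, 22, 27, 31, 24, 29, 26, 28, 33]

p_value = permutation_test(control, treated)
print(f"permutation p-value: {p_value:.4f}")
```

In a genome-wide analysis, a test like this would be run once per gene and the resulting p-values adjusted for multiple testing. The abstract does not say which adjustment, if any, the author used.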

