Mining Approximate Frequent Dense Modules from Multiple Gene Expression Datasets

Seo, San Ha

Abstract

A large amount of gene expression data has been collected under various environmental and biological conditions. Extracting dense modules that recur across multiple gene coexpression networks has been shown to be promising for functional gene annotation and biomarker discovery. In this thesis, we propose a biclustering-based approach for mining approximate frequent dense modules. This approach reports a large number of modules, many of which are duplicates. We therefore build on it and propose two extended approaches that mine a set of representative patterns, one using post-processing and the other using on-line pattern summarization. The extended approaches report fewer modules and fewer duplicates. Experiments on real gene coexpression networks show that frequent dense modules are biologically interesting, as evidenced by the large percentage of frequent dense modules that are biologically enriched.
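
To make the notion of a frequent dense module concrete, the following minimal sketch checks whether a given gene set is dense in enough networks. It is not the thesis's mining algorithm. The definitions used here are assumptions for illustration: density is the fraction of gene pairs that are connected, and support is the number of networks in which that density meets a threshold. The function names and the parameters `min_density` and `min_support` are likewise hypothetical.

```python
from itertools import combinations


def make_network(pairs):
    """Build an undirected coexpression network as a set of unordered gene pairs."""
    return {frozenset(pair) for pair in pairs}


def density(genes, edges):
    """Fraction of all possible gene pairs in `genes` that are edges (assumed definition)."""
    pairs = list(combinations(sorted(genes), 2))
    if not pairs:
        return 0.0
    present = sum(1 for a, b in pairs if frozenset((a, b)) in edges)
    return present / len(pairs)


def support(genes, networks, min_density):
    """Number of networks in which `genes` induces a subgraph of density >= min_density."""
    return sum(1 for edges in networks if density(genes, edges) >= min_density)


def is_frequent_dense(genes, networks, min_density, min_support):
    """True if `genes` is dense in at least `min_support` networks (assumed definition)."""
    return support(genes, networks, min_density) >= min_support


# Toy data: three small networks over hypothetical genes g1..g4.
networks = [
    make_network([("g1", "g2"), ("g1", "g3"), ("g2", "g3")]),  # all 3 pairs present
    make_network([("g1", "g2"), ("g2", "g3"), ("g3", "g4")]),  # 2 of 3 pairs present
    make_network([("g1", "g4")]),                              # no pairs present
]
module = {"g1", "g2", "g3"}

print(support(module, networks, min_density=0.6))
print(is_frequent_dense(module, networks, min_density=0.6, min_support=2))
```

Allowing a density threshold below 1.0 is what makes the module "approximate" in this sketch: the gene set need not form a complete clique in every network that supports it.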

URI

https://hdl.handle.net/10365/32307

Collections

Computer Science Masters Theses