Mining Approximate Frequent Dense Modules from Multiple Gene Expression Datasets

Seo, San Ha

Files

Mining Approximate Frequent Dense Modules from Multiple Gene Expression Datasets.pdf (1.16 MB)

Date

2021

Authors

Seo, San Ha

Publisher

North Dakota State University

Abstract

A large amount of gene expression data has been collected for various environmental and biological conditions. Extracting dense modules that recur in multiple gene coexpression networks has been shown to be promising for functional gene annotation and biomarker discovery. In this thesis, we propose a biclustering-based approach for mining approximate frequent dense modules. This approach reports a large number of modules, many of which are duplicates. We therefore build on it and propose two extended approaches for mining dense modules, which mine a set of representative patterns using post-processing and on-line pattern summarization methods. The extended approaches report fewer modules and fewer duplicates. Experiments on real gene coexpression networks show that frequent dense modules are biologically interesting, as evidenced by the large percentage of frequent dense modules that are biologically enriched.

Keywords

bioinformatics, gene coexpression network, graph mining

URI

https://hdl.handle.net/10365/32307

Collections

Computer Science Masters Theses
