Classifying Gene Coexpression Networks Using Discrimination Pattern Mining

Date

2016

Publisher

North Dakota State University

Abstract

Several algorithms for graph classification have been proposed. Algorithms that map graphs into feature vectors encoding the presence or absence of specific subgraphs have shown excellent performance. Most existing algorithms mine for subgraphs that appear frequently in graphs belonging to one class label and less frequently in the other graphs. Classification of gene coexpression networks has attracted considerable attention in recent years from researchers in both biology and data mining because of its numerous useful applications. Advances in high-throughput technologies that provide easy access to large microarray datasets have necessitated the development of new techniques that scale well to large datasets and produce highly accurate results. In this thesis, we propose a novel approach for mining discriminative patterns. We propose two algorithms for mining discriminative patterns and then use these patterns for graph classification. Experiments on large coexpression graphs show that the proposed approach has excellent performance and scales to graphs with millions of edges. We compare our proposed algorithm to two baseline algorithms and show that it outperforms the baseline techniques, achieving very high graph classification accuracy. Moreover, we perform topological and biological enrichment analysis on the discriminative patterns reported by our mining algorithm and show that the reported patterns are significantly enriched.
