Empirical Study of Two Hypothesis Test Methods for Community Structure in Networks

Nan, Yehong

Empirical Study of Two Hypothesis Test Methods for Community Structure in Networks

Files

Empirical Study of Two Hypothesis Test Methods for Community Structure in Networks.pdf (416.98 KB)

Date

2019

Authors

Nan, Yehong

Publisher

North Dakota State University

Abstract

Many real-world network data can be formulated as graphs, where a binary relation exists between nodes. One of the fundamental problems in network data analysis is community detection, clustering the nodes into different groups. Statistically, this problem can be formulated as hypothesis testing: under the null hypothesis, there is no community structure, while under the alternative hypothesis, community structure exists. One is of the method is to use the largest eigenvalues of the scaled adjacency matrix proposed by Bickel and Sarkar (2016), which works for dense graph. Another one is the subgraph counting method proposed by Gao and Lafferty (2017a), valid for sparse network. In this paper, firstly, we empirically study the BS or GL methods to see whether either of them works for moderately sparse network; secondly, we propose a subsampling method to reduce the computation of the BS method and run simulations to evaluate the performance.

Keywords

Bickel and Sarkar method, community network, degree-corrected SBM, Erdős–Rényi model, Gao and Lafferty method, sparse network

URI

https://hdl.handle.net/10365/31640

Collections

Statistics Masters Theses

Full item page

Empirical Study of Two Hypothesis Test Methods for Community Structure in Networks

Files

Date

Authors

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Description

Keywords

Citation

URI

Collections