Bayesian Sparse Factor Analysis of High Dimensional Gene Expression Data
Abstract
This work closely studied fundamental techniques of Bayesian sparse Factor Analysis model - constrained Least Square regression, Bayesian Lasso regression, and some popular sparsity-inducing priors. In Appendix A, we introduced each of the fundamental techniques in a coherent manner and provided detailed proof for important formulas and definitions. We consider provided introduction and detailed proof, which are very helpful in learning Bayesian sparse Factor Analysis, as a contribution of this work. We also systematically studied a computationally tractable biclustering approach in identifying co-regulated genes, BicMix, by proving all point estimates of the parameters and by running the method on both simulated data sets and a real high-dimensional gene expression data set. Missed derivation of all point estimates in BicMix has been provided for better understanding variational expectation maximization (VEM) algorithm. The performance of the method for identifying true biclusters has been analyzed using the experimental results.