Browsing by Author "Liu, Xiyuan"
Conditional Random Field with Lasso and its Application to the Classification of Barley Genes Based on Expression Level Affected by Fungal Infection (North Dakota State University, 2019) Liu, Xiyuan
The problem of classifying gene expression levels, a part of gene expression analysis, is a major research area in statistics. Several classical methods address the classification problem. To apply the Logistic Regression Model (LRM) and other classical methods, the dataset must satisfy independence assumptions: the observations must be independent of each other, and the predictors (independent variables) must be independent. These assumptions are usually violated in gene expression analysis. Although the classical Hidden Markov Chain Model (HMM) can handle dependence among observations, it requires the independent variables in the dataset to be discrete and independent. Unfortunately, gene expression level is a continuous variable. To solve the classification problem for gene expression level data, the Conditional Random Field (CRF) is introduced. Finally, the Least Absolute Shrinkage and Selection Operator (LASSO) penalty, a dimension reduction method, is introduced to improve the CRF model.

Estimating Return on Initial Public Offering Using Mixtures of Regressions (North Dakota State University, 2015) Liu, Xiyuan
Financial advisors working in a stock exchange market often have to convince a client of the merits of investing in a company that has just entered the market. To predict a company's return based on its revenue, a simple linear regression may be used. This thesis finds that a model based on a mixture of regressions is superior to a simple linear regression. The error term in each regression component is assumed to follow a standard Gaussian distribution. The model is tested on data from 116 companies that entered the market through an Initial Public Offering (IPO). A two-component mixture of regressions is found to provide the best fit for the data. A simulation study is conducted to verify the performance of this model. The optimal number of components is found using the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), as well as parametric bootstrapping of the likelihood ratio test statistic.