Comparative Classification of Prostate Cancer Data using the Support Vector Machine, Random Forest, Dualks and k-Nearest Neighbours

Sakouvogui, Kekoura

dc.contributor.author	Sakouvogui, Kekoura
dc.description.abstract	This paper compares four classification tools, Support Vector Machine (SVM), Random Forest (RF), DualKS, and k-Nearest Neighbors (kNN), which are based on different statistical learning theories. The dataset consists of microarray gene expression data from 596 male patients with prostate cancer. After treatment, the patients were classified by a single phenotype with three levels: PSA (Prostate-Specific Antigen), Systematic, and NED (No Evidence of Disease). The purpose of this research is to determine the performance of each classifier by selecting the kernels and parameters that give the best prediction rate for the phenotype. The paper begins with a discussion of previous implementations of the tools and their mathematical theories. The results showed that three classifiers achieved comparable, above-average performance, while DualKS did not. Among all four, SVM outperformed the kNN, RF, and DualKS classifiers.	en_US
dc.publisher	North Dakota State University	en_US
dc.rights	NDSU Policy 190.6.2
dc.title	Comparative Classification of Prostate Cancer Data using the Support Vector Machine, Random Forest, Dualks and k-Nearest Neighbours	en_US
dc.type	Thesis	en_US
dc.date.accessioned	2018-03-12T18:21:28Z
dc.date.available	2018-03-12T18:21:28Z
dc.date.issued	2015	en_US
dc.identifier.uri	https://hdl.handle.net/10365/27698
dc.rights.uri	https://www.ndsu.edu/fileadmin/policy/190.pdf
ndsu.degree	Master of Science (MS)	en_US
ndsu.college	Science and Mathematics	en_US
ndsu.department	Statistics	en_US
ndsu.program	Statistics	en_US
ndsu.advisor	Yang, Yarong

Files in this item

Name: Comparative Classification of ...
Size: 769.3Kb
Format: PDF

This item appears in the following Collection(s)

Statistics Masters Theses
