Two Data Mining Applications for Predicting Pre-Diabetes

Date

2015

Publisher

North Dakota State University

Abstract

In this study, the performance of Logistic Regression and Decision Tree modeling is compared, using SAS Enterprise Miner, for predicting pre-diabetes in the US population from several common factors in the type 2 diabetes screening criteria. From 17 variables across three NHANES datasets, a total of 13 risk factors were selected as predictors of pre-diabetes. A comparison of the two data mining methodologies showed that Decision Tree modeling has a higher ROC index than Logistic Regression modeling. The ROC indexes for both models were greater than 77%, indicating that both methods provide good prediction of pre-diabetes. The predictive accuracy of both models was greater than 72% on the whole dataset. Decision Tree modeling also resulted in higher accuracy and sensitivity values than Logistic Regression modeling. Taken as a whole, the results of the comparison indicate that Decision Tree modeling is the better method for predicting pre-diabetes.
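
The three measures named in the abstract (accuracy, sensitivity, and ROC index) can be illustrated with a minimal Python sketch. This is not the thesis workflow, which used SAS Enterprise Miner; the function names, the 0.5 cutoff, and the eight labels and scores below are hypothetical values chosen only to show how each measure is computed. The ROC index is computed here as the area under the ROC curve, using the pairwise-ranking form.

```python
def accuracy(y_true, y_pred):
    """Share of all cases whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true, y_pred):
    """Share of true positive cases (label 1) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def roc_index(y_true, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive case receives a higher score than a randomly chosen negative
    case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = pre-diabetic, 0 = not pre-diabetic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical predicted probabilities from some fitted model.
scores = [0.9, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1, 0.8]
# Assumed classification cutoff of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:   ", accuracy(y_true, y_pred))     # 0.75
print("sensitivity:", sensitivity(y_true, y_pred))  # 0.75
print("ROC index:  ", roc_index(y_true, scores))    # 0.9375
```

Accuracy and sensitivity depend on the chosen cutoff, whereas the ROC index summarizes ranking quality across all cutoffs, which is why a model comparison of this kind typically reports both.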

Description

The title page incorrectly classifies this document as a dissertation; the NDSU Graduate School decided to classify it as a thesis.
