Credit Card Fraud Detection Predictive Modeling
Abstract
Financial fraud is a growing problem with serious consequences for the financial industry, and data mining has been successfully applied to large volumes of complex financial data to automate the analysis of credit card fraud in online transactions. Data mining for fraud detection is challenging for two major reasons: first, the profiles of normal and fraudulent behavior change frequently, and second, credit card fraud datasets are highly skewed. This paper investigates the performance of the Random Forest, AdaBoost, XGBoost, and LightGBM classifiers on highly skewed credit card fraud data. The dataset consists of 284,786 credit card transactions made by European cardholders. The techniques are applied to both the raw and the preprocessed data, and their performance is evaluated in terms of accuracy, sensitivity, specificity, and precision. The results indicate that the best accuracies achieved by the Random Forest, AdaBoost, XGBoost, and LightGBM classifiers are 85%, 83%, 97.4%, and 93%, respectively.
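The four evaluation measures named above follow from the counts of a binary confusion matrix. The sketch below is a minimal illustration and not the paper's evaluation code: the function names `confusion_counts` and `evaluate` are hypothetical, fraud is assumed to be encoded as label 1 and legitimate transactions as label 0, and the toy labels are invented purely to show the effect of class skew.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, assuming 1 = fraud, 0 = legitimate."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity, specificity, and precision as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        # A metric is undefined (NaN) when its denominator is zero.
        return num / den if den else math.nan

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # share of fraud cases detected
        "specificity": ratio(tn, tn + fp),  # share of legitimate cases cleared
        "precision": ratio(tp, tp + fp),    # share of fraud alerts that are correct
    }


# Invented toy data: 990 legitimate and 10 fraudulent transactions,
# scored by a trivial model that labels every transaction as legitimate.
toy_truth = [0] * 990 + [1] * 10
toy_pred = [0] * 1000
toy_scores = evaluate(toy_truth, toy_pred)
print(toy_scores)
```

On the toy data the trivial model misses every fraud case yet still scores high on accuracy, which is why sensitivity, specificity, and precision are reported alongside accuracy when the classes are highly skewed.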