Show simple item record

dc.contributor.authorAl-Sawwa, Jamil
dc.description.abstractApplying the nature-inspired methods in the data mining area has been gaining more attention by researchers. Classification is one of the data mining tasks which aims to analyze historical data by discovering hidden relationships between the input and the output that would help to predict an accurate outcome for an unseen input. The classification algorithms based on nature-inspired methods have been successfully used in numerous applications such as medicine and agriculture. However, the amount of data that has been collected or generated in these areas has been increasing exponentially. Thus, extracting useful information from large data requires computational time and consumes memory space. Besides this, many algorithms suffer from not being able to handle imbalanced data. Apache Spark is an in-memory computing big data framework that runs on a cluster of nodes. Apache Spark is more efficient for handling iterative and interactive jobs and runs 100 times faster than Hadoop Map-Reduce for various applications. However, the challenge is to find a scalable solution using Apache Spark for the optimization-based classification algorithms that would scale very well with large data. In this dissertation, we firstly introduce new variants of a centroid-based particle swarm optimization (CPSO) classification algorithm in order to improve its performance in terms of misclassification rate. Furthermore, a scalable particle swarm optimization classification algorithm (SCPSO) is designed and implemented using Apache Spark. Two variants of SCPSO, namely SCPSO-F1 and SCPSO-F2, are proposed based on different fitness functions. The experiments revealed that SCPSO-F1 and SCPSO-F2 utilize the cluster of nodes efficiently and achieve good scalability results. Moreover, we propose a cost-sensitive differential evolution classification algorithm to improve the performance of the differential evolution classification algorithm when applied to imbalanced data sets. The experimental results demonstrate that the proposed algorithm efficiently handles highly imbalanced binary data sets compared to the current variants of differential evolution classification algorithms. Finally, we designed and implemented a parallel version of a cost-sensitive differential evolution classifier using the Spark framework. The experiments revealed that the proposed algorithm achieved good speedup and scaleup results and obtained good performance.en_US
dc.publisherNorth Dakota State Universityen_US
dc.rightsNDSU policy 190.6.2en_US
dc.titleScalable Particle Swarm Optimization and Differential Evolution Approaches Applied to Classificationen_US
dc.typeDissertationen_US
dc.typeVideoen_US
dc.date.accessioned2020-09-16T17:28:10Z
dc.date.available2020-09-16T17:28:10Z
dc.date.issued2019
dc.identifier.urihttps://hdl.handle.net/10365/31538
dc.rights.urihttps://www.ndsu.edu/fileadmin/policy/190.pdfen_US
ndsu.degreeDoctor of Philosophy (PhD)en_US
ndsu.collegeEngineeringen_US
ndsu.departmentComputer Scienceen_US
ndsu.programComputer Scienceen_US
ndsu.advisorLudwig, Simone


Files in this item

Thumbnail
Thumbnail

This item appears in the following Collection(s)

Show simple item record