    • Analyzing Access Logs Data using Stream Based Architecture 

      Gautam, Nitendra (North Dakota State University, 2018)
      Over the past decades, the enterprise-level IT infrastructure in many businesses has grown from a few to thousands of servers, increasing the digital footprints they produce. These digital footprints include access logs ...
    • A Comparative Study on Different Big Data Tools 

      Ibtisum, Sifat (North Dakota State University, 2020)
      Big data has long been a topic of fascination for computer science enthusiasts around the world, and has gained even more prominence in recent times with the continuous explosion of data resulting from the likes of social ...
    • K-Anonymization Implementation Using Apache Spark 

      Tortikar, Pratik (North Dakota State University, 2019)
      This experiment attempts to anonymize data that can reveal a person’s identity, using the k-1 anonymity principle. "Given person-specific field-structured data, produce a release of the data with scientific guarantees that ...
    • A Map Reduce Approach of K-Means++ Algorithm with Initial Equidistant Centers 

      Bhattacharyya, Krittika (North Dakota State University, 2015)
      Data clustering has received considerable attention in many applications, such as data mining, document retrieval, image segmentation and pattern classification. The growing volumes of information emerging by the ...
    • Market Basket Analysis Algorithm with MapReduce Using HDFS 

      Nuthalapati, Aditya (North Dakota State University, 2017)
      Market basket analysis techniques are important to everyday business decisions. The traditional single-processor, main-memory-based computing approach is not capable of handling ever-increasing large ...
    • Mining Quasi-Frequent Subnetworks in Graph Networks Using Edge-Edge Summary Graph 

      Dawar, Priyanka (North Dakota State University, 2017)
      In today’s computing world, graphs have become increasingly important in modeling sophisticated structures, entities and their interactions, with broad applications including Bioinformatics, Computer Vision, Web analysis ...
    • Naïve Bayes Classifier: A MapReduce Approach 

      Zheng, Songtao (North Dakota State University, 2014)
      Machine learning algorithms have the advantage of making use of the powerful Hadoop distributed computing platform and the MapReduce programming model to process data in parallel. Many machine learning algorithms have been ...
    • Parallelization of Particle Swarm Optimization Algorithm Using Hadoop Mapreduce 

      Ghosh, Priyanka Singh (North Dakota State University, 2016)
      Particle Swarm Optimization (PSO) has received attention in many research fields and real-world applications for solving optimization problems in the areas of intelligent transportation systems, wireless sensor networks, ...
    • Performance Comparison of Apache Spark MLlib 

      Sharma, Pallavi (North Dakota State University, 2018)
      This study attempts to understand the performance of Apache Spark and the MLlib platform. To this end, the cluster computing system of Apache Spark is set up and five supervised machine learning algorithms (Naïve-Bayes, ...
    • Stock Price Prediction Using Recurrent Neural Networks 

      Jahan, Israt (North Dakota State University, 2018)
      The stock market is generally very unpredictable. Many factors might determine the price of a particular stock, such as the market trend, supply and demand ratio, global economy, ...
    • Study of Similarity Coefficients Using MapReduce Programming Model 

      Nayakam, GhanaShyam Nath (North Dakota State University, 2013)
      MapReduce is a programming model for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that ...