Association Rule Mining of Biological Field Data Sets
Abstract
Association rule mining is an important data mining technique, yet, its use in association analysis of biological data sets has been limited. This mining technique was applied on two biological data sets, a genome and a damselfly data set. The raw data sets were pre-processed, and then association analysis was performed with various configurations. The pre-processing task involves minimizing the number of association attributes in genome data and creating the association attributes in damselfly data. The configurations include generation of single/maximal rules and handling single/multiple tier attributes. Both data sets have a binary class label and using association analysis, attributes of importance to each of these class labels are found. The results (rules) from association analysis are then visualized using graph networks by incorporating the association attributes like support and confidence, differential color schemes and features from the pre-processed data.