Vector-Vector Patterns for Agricultural Data
Abstract
Agriculture is increasingly driven by large volumes of data, and some of the resulting challenges are not covered by existing statistics, machine learning, or data mining techniques. Many crops are characterized not only by yield but also by quality measures, such as sugar content and sugar lost to molasses for sugarbeets. The set of features furthermore contains time series data, such as rainfall and periodic satellite imagery. This study examines the problem of identifying relationships in a complex data set in which both the explanatory and the response conditions are vectors (multiple attributes). This problem can be characterized as a vector-vector pattern mining problem. The proposed algorithm uses one of the vector representations to determine the neighbors of a randomly picked instance, and then tests the randomness of that subset within the other vector representation. Compared to conventional approaches, the vector-vector algorithm is better at distinguishing existing relationships.
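The neighbor-then-randomness idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance measure or the randomness test, so the sketch assumes Euclidean distance for the neighbor search and a simple Monte Carlo comparison against random subsets as the test. The function name `neighbor_randomness_pvalue`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def neighbor_randomness_pvalue(X, Y, k=10, n_random=200, rng=None):
    """Hypothetical sketch: pick a random instance, take its k nearest
    neighbors in the explanatory vectors X, and test whether that subset
    is unusually compact in the response vectors Y."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]

    # Randomly picked instance and its k nearest neighbors in X
    # (assumed Euclidean distance; the instance itself is included).
    seed_idx = rng.integers(n)
    dist = np.linalg.norm(X - X[seed_idx], axis=1)
    neighbors = np.argsort(dist)[:k]

    def spread(idx):
        # Mean distance of the subset to its own centroid, measured in Y.
        sub = Y[idx]
        return np.linalg.norm(sub - sub.mean(axis=0), axis=1).mean()

    observed = spread(neighbors)

    # Null distribution (assumed test): spread of random subsets of size k.
    null = np.array([
        spread(rng.choice(n, size=k, replace=False))
        for _ in range(n_random)
    ])

    # One-sided p-value with add-one smoothing: a small value means the
    # neighbors in X do not look like a random subset in Y.
    return (1 + np.sum(null <= observed)) / (1 + n_random)

# Synthetic illustration only (no real agricultural data).
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(500, 2))
Y_related = np.column_stack([X[:, 0] + X[:, 1], X[:, 0] - X[:, 1]])
Y_related = Y_related + 0.05 * data_rng.normal(size=(500, 2))
Y_unrelated = data_rng.normal(size=(500, 2))

print("related:  ", neighbor_randomness_pvalue(X, Y_related, rng=1))
print("unrelated:", neighbor_randomness_pvalue(X, Y_unrelated, rng=1))
```

In this sketch a single random instance yields a single p-value. A practical variant would presumably repeat the procedure over many random instances and standardize attributes measured in different units before computing distances; both are assumptions not stated in the abstract.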