Indexed by:Journal Article
Date of Publication:2014-05-20
Journal:Journal of Information and Computational Science
Included Journals:EI, Scopus
Volume:11
Issue:8
Page Number:2809-2816
ISSN No.:15487741
Abstract:Mainstream parallel algorithms for mining frequent itemsets (patterns) were designed by implementing the FP-Growth or Apriori algorithm on the MapReduce (MR) framework. Existing MR FP-Growth algorithms cannot distribute data equally among nodes, and MR Apriori algorithms use multiple map/reduce procedures and generate too many key-value pairs with a value of 1; these disadvantages hinder their performance. This paper proposes an algorithm, FIMMR: it first mines local frequent itemsets for each data chunk as candidates, applies pruning strategies to the candidates, and then identifies global frequent itemsets from the candidates. Experimental results show that FIMMR is significantly more time-efficient than PFP and SPC; under small minimum support thresholds, FIMMR achieves an order-of-magnitude improvement over the other two algorithms; meanwhile, the speedup of FIMMR is also satisfactory. Copyright © 2014 Binary Information Press.
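The two-phase idea described in the abstract (local frequent itemsets per chunk serve as candidates, then global frequent itemsets are identified from those candidates) can be illustrated with a minimal single-process sketch. This is not the authors' FIMMR implementation: the function names `local_frequent_itemsets` and `mine_global` are hypothetical, the local miner is a brute-force enumeration chosen for brevity, the minimum support is assumed to be given as a ratio, and the pruning strategies are omitted because the abstract does not describe them. In a MapReduce setting, the first phase would roughly correspond to map tasks over chunks and the second to a counting pass over all data.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def local_frequent_itemsets(chunk, min_ratio):
    """Brute-force miner for one chunk (an assumed stand-in for the local miner)."""
    threshold = ceil(min_ratio * len(chunk))
    counts = Counter()
    for transaction in chunk:
        items = sorted(set(transaction))
        # Enumerate every non-empty subset of the transaction.
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[frozenset(itemset)] += 1
    return {s for s, c in counts.items() if c >= threshold}


def mine_global(chunks, min_ratio):
    """Return {itemset: global count} for itemsets frequent over all chunks."""
    # Phase 1: every chunk proposes its locally frequent itemsets as candidates.
    candidates = set()
    for chunk in chunks:
        candidates |= local_frequent_itemsets(chunk, min_ratio)

    # Phase 2: count each candidate over the whole dataset and keep those
    # that reach the global threshold.
    total = sum(len(chunk) for chunk in chunks)
    threshold = ceil(min_ratio * total)
    counts = Counter()
    for chunk in chunks:
        for transaction in chunk:
            present = set(transaction)
            for candidate in candidates:
                if candidate <= present:
                    counts[candidate] += 1
    return {s: c for s, c in counts.items() if c >= threshold}


if __name__ == "__main__":
    chunks = [
        [["a", "b"], ["a", "c"]],
        [["a", "b"], ["b", "c"]],
    ]
    for itemset, count in mine_global(chunks, 0.5).items():
        print(sorted(itemset), count)
```

The candidate set is complete because an itemset that is frequent over the whole dataset must be frequent in at least one chunk when the same support ratio is applied locally, so the second phase only has to discard false positives.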