Research and Implementation of Frequent Itemset Mining: Python Implementations of the Apriori and FP-Growth Algorithms

Summary

With the rapid advance of technology, and of storage technology in particular, we now live in an ocean of data, and businesses are accumulating massive amounts of it. The question of how to discover the valuable information hidden in that data was an important factor in the birth of data mining. In the 1990s, Wal-Mart reportedly found the classic "beer and diapers" story in its massive transaction data. The finding revealed some shopping habits of Americans, and the store layout was adjusted accordingly, significantly increasing profits. This process is called "market basket analysis." It is an early success story of data mining in practical application, and it is also the origin of frequent itemset mining.

Today, growing emphasis is placed on the potential value hidden in massive data, and data mining techniques continue to be applied in the Internet, telecommunications, finance, business, and other fields. Among these, the use of frequent itemset mining in business has become an important research topic.

The following are the main contents of this paper:

  1. A detailed exposition of the concepts and theory behind frequent itemset mining;
  2. A detailed description of the two most classic frequent itemset mining algorithms: the Apriori algorithm and the FP-Growth algorithm;
  3. An implementation of each algorithm, explaining in detail how Apriori iteratively generates all frequent itemsets, how FP-Growth constructs the FP-tree and mines frequent itemsets from it, and how association rules are computed from the frequent itemsets, followed by an analysis of the algorithms' results;
  4. A comparative analysis of the Apriori and FP-Growth algorithms, covering their advantages, disadvantages, and application scenarios.
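To make the Apriori part of item 3 concrete, here is a minimal Python sketch of level-wise frequent itemset generation followed by rule extraction. It is not the paper's implementation: the function names `apriori` and `association_rules`, the use of an absolute support count as the threshold, and the toy baskets are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets.

    min_support is an absolute transaction count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count the transactions containing each candidate, keep the frequent ones.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_support}

    # Level 1: frequent single items.
    current = count({frozenset([i]) for t in transactions for i in t})
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = count(candidates)
        frequent.update(current)
        k += 1
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, n in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = n / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical toy baskets, invented for this example.
baskets = [
    {"beer", "diapers", "milk"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "bread"},
    {"beer", "diapers", "bread"},
]
frequent_itemsets = apriori(baskets, min_support=3)
found_rules = association_rules(frequent_itemsets, min_confidence=0.7)
```

With these baskets and a support threshold of 3, only `beer`, `diapers`, and the pair `{beer, diapers}` are frequent, and both directions of the rule between them have confidence 0.75. The sketch rescans every transaction at each level, which is the cost that FP-Growth avoids by compressing the transactions into an FP-tree.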

Keywords: data mining; Apriori algorithm; FP-Growth algorithm; market basket analysis

 

 

 

Origin blog.csdn.net/asdJJkk/article/details/93377190