2025-02-27 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to understand the Apriori algorithm: what it is used for, the concepts it relies on, and the idea behind it. It is aimed at beginners.
1. The purpose of the Apriori algorithm:
Apriori is mainly used to mine association rules: it finds frequent itemsets in a transaction dataset and derives association rules from them. The name comes from the algorithm's use of prior knowledge: the frequent itemsets found in the previous pass are used to generate the candidate frequent itemsets for the current pass. Apriori is the core algorithm in association analysis.
The characteristics of the Apriori algorithm:
It can only handle categorical variables, not numeric variables.
The data can be stored in transaction format (a transaction table) or in the form of a fact table (tabular data).
The core of the algorithm is designed to improve the efficiency of mining association rules.
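To make the two storage formats concrete, here is a minimal Python sketch. The exact layouts are assumptions, since the article does not specify them: the transaction format is taken to be one row per (transaction id, item) pair, and the table format one row per transaction with a true/false column per item. The item names and the helper functions `from_transaction_rows` and `from_table_rows` are hypothetical.

```python
# Assumed transaction format: one row per (transaction id, item) pair.
transaction_rows = [
    (1, "bread"), (1, "milk"),
    (2, "bread"), (2, "diapers"), (2, "beer"),
    (3, "milk"), (3, "diapers"),
]

# Assumed table format: one row per transaction, one boolean column per item.
table_rows = [
    {"bread": True, "milk": True, "diapers": False, "beer": False},
    {"bread": True, "milk": False, "diapers": True, "beer": True},
    {"bread": False, "milk": True, "diapers": True, "beer": False},
]


def from_transaction_rows(rows):
    """Group (transaction id, item) pairs into one itemset per transaction."""
    baskets = {}
    for tid, item in rows:
        baskets.setdefault(tid, set()).add(item)
    # Sort by transaction id so the output order is deterministic.
    return [frozenset(items) for _, items in sorted(baskets.items())]


def from_table_rows(rows):
    """Keep the items whose column is true in each row."""
    return [frozenset(item for item, present in row.items() if present)
            for row in rows]


# Both layouts describe the same three transactions.
baskets_a = from_transaction_rows(transaction_rows)
baskets_b = from_table_rows(table_rows)
print(baskets_a == baskets_b)
```

Either way, the data ends up as a list of itemsets, which is the form the rest of the algorithm works on.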
2. New concepts:
Itemset: a set of items. For example, the set of all items on sale is an itemset, and each transaction t (such as the items bought on one receipt) is also an itemset.
Support: the support of an itemset is the proportion of all transactions that contain the itemset.
Frequent itemsets: itemsets whose support meets a given minimum support threshold.
Association rules: X -> Y means that Y can be derived from X, that is, transactions that contain X tend to also contain Y.
Confidence: the confidence of X -> Y is P(X, Y) / P(X), that is, the proportion of the transactions containing itemset X that also contain itemset Y.
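The definitions of support and confidence above can be sketched in a few lines of Python. The transactions and the function names `count`, `support`, and `confidence` are made up for illustration; they are not from any particular library.

```python
# Hypothetical toy dataset: each transaction is the set of items on one receipt.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Proportion of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(x, y, transactions):
    """Of the transactions containing X, the proportion that also contain Y."""
    return count(set(x) | set(y), transactions) / count(x, transactions)


s = support({"bread", "milk"}, transactions)       # 3 of 5 transactions
c = confidence({"bread"}, {"milk"}, transactions)  # 3 of the 4 with bread
print(s, c)
```

Here {bread, milk} appears in 3 of the 5 transactions, and bread appears in 4, so the rule bread -> milk has support 3/5 and confidence 3/4.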
3. The idea behind Apriori:
A rule is considered valid only if both its support and its confidence meet our thresholds. In practice we often face a large amount of data. A simple exhaustive search would produce a huge number of rules, most of them invalid, and would be very inefficient. Apriori improves efficiency by first generating frequent itemsets and then generating rules only from those frequent itemsets.
The above represents the two steps of the Apriori algorithm: generating frequent itemsets and generating rules based on frequent itemsets.
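The two steps can be sketched as follows. This is a minimal, unoptimized illustration rather than a reference implementation: the function names `apriori` and `generate_rules`, the toy data, and the thresholds are all assumptions. The pruning step relies on the standard Apriori property that every subset of a frequent itemset must itself be frequent, which is how the previous pass's results limit the candidates of the current pass.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Step 1: return {itemset: count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def keep_frequent(candidates):
        kept = {}
        for c in candidates:
            hits = sum(1 for t in transactions if c <= t)
            if hits / n >= min_support:
                kept[c] = hits
        return kept

    # Pass 1: frequent single items.
    current = keep_frequent({frozenset([i]) for t in transactions for i in t})
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev
                             for s in combinations(c, k - 1))}
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent


def generate_rules(frequent, min_confidence):
    """Step 2: return (X, Y, confidence) for rules X -> Y that pass."""
    rules = []
    for itemset, hits in frequent.items():
        for r in range(1, len(itemset)):
            for x in combinations(itemset, r):
                x = frozenset(x)
                # x is a subset of a frequent itemset, so it is in `frequent`.
                conf = hits / frequent[x]
                if conf >= min_confidence:
                    rules.append((x, itemset - x, conf))
    return rules


# Hypothetical toy data and thresholds.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent = apriori(transactions, min_support=0.6)
rules = generate_rules(frequent, min_confidence=0.7)
for x, y, conf in rules:
    print(sorted(x), "->", sorted(y), conf)
```

With these thresholds, the only frequent itemset of size two is {bread, milk}, so rules are generated from that itemset alone instead of from every possible pair of items.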
Why determine frequent itemsets?
As noted above, the support must be at least the minimum we specify. This ensures that the rules generated later are based on itemsets that are broadly representative, because support itself indicates how general the results of the association analysis are.