2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article shows how to use the data mining algorithm Apriori. It covers the algorithm's overview, application scenarios, basic concepts, and implementation principle, and should help readers who want to understand how Apriori works.
I. Overview of the algorithm
The Apriori algorithm is one of the most influential algorithms for mining frequent itemsets for Boolean association rules. It was proposed by Rakesh Agrawal and Ramakrishnan Srikant. It uses an iterative, layer-by-layer search in which frequent k-itemsets are used to explore (k+1)-itemsets. First, the set of frequent 1-itemsets is found and recorded as L1. L1 is used to find L2, the set of frequent 2-itemsets, L2 is used to find L3, and so on, until no more frequent k-itemsets can be found. Finding each Lk requires one scan of the database. To make the layer-by-layer generation of frequent itemsets more efficient, an important property called the Apriori property is used to shrink the search space. It has two parts: every non-empty subset of a frequent itemset must also be frequent, and every superset of an infrequent itemset must also be infrequent.
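To make the Apriori property concrete, here is a minimal Python sketch of how candidate k-itemsets might be built from the frequent (k-1)-itemsets and then pruned. The function name `generate_candidates` and the sample L2 are assumptions made up for illustration; they do not come from the original article.

```python
from itertools import combinations

def generate_candidates(prev_frequent, k):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets.

    Join step: union two (k-1)-itemsets when the result has exactly k items.
    Prune step: keep a candidate only if every (k-1)-subset of it is frequent,
    which is the Apriori property applied in reverse.
    """
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) != k:
                continue
            # A candidate with any infrequent subset cannot be frequent itself.
            if all(frozenset(sub) in prev for sub in combinations(union, k - 1)):
                candidates.add(union)
    return candidates

# Hypothetical frequent 2-itemsets (illustrative only).
L2 = {frozenset("AC"), frozenset("BC"), frozenset("BE"), frozenset("CE")}
C3 = generate_candidates(L2, 3)
print(C3)
```

In this made-up example, {A, B, C} and {A, C, E} are discarded without ever being counted, because {A, B} and {A, E} are not in L2. Only {B, C, E} survives as a candidate.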
II. Application scenarios
The Apriori algorithm is widely used in consumer market price analysis, in predicting customers' consumption habits, and in intrusion detection in the field of network security. It can be used in university management, where the mined rules can help school administrators carry out financial aid work for students in need. It can also be used in mobile communications to guide operators' business operations and to support service providers' decision-making.
III. Basic concepts
The two most important concepts in the Apriori algorithm are support and confidence:
Support: support({A, B}) = P(A ∩ B), that is, the probability that events A and B occur at the same time.
Confidence: confidence(A => B) = support({A, B}) / support({A}), that is, the probability that B also occurs given that A has occurred. In other words, the confidence from A to B is the support of {A, B} divided by the support of {A}.
Minimum support: a predetermined threshold, usually chosen after several trial runs of the algorithm, used to eliminate elements from each candidate set so as to obtain the frequent itemsets of the next layer.
Minimum confidence: a preset threshold that a rule's confidence must reach.
Strong rules: rules that satisfy both minimum support and minimum confidence are called strong rules.
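The definitions above can be sketched in a few lines of Python. The five transactions below are made up for illustration and are not the records from the article's figure. The helper names `support`, `confidence`, and `is_strong` are likewise assumptions.

```python
# Hypothetical transaction data (illustrative only).
TRANSACTIONS = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
    {"A", "C"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return support_count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(A and B together) / support(A).

    Computed from raw counts, since the 1/N factor cancels in the ratio.
    """
    both = set(antecedent) | set(consequent)
    base = support_count(antecedent, transactions)
    return support_count(both, transactions) / base if base else 0.0

def is_strong(antecedent, consequent, transactions, min_support, min_confidence):
    """A rule is strong if it meets both thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(both, transactions) >= min_support
            and confidence(antecedent, consequent, transactions) >= min_confidence)

print(support({"B", "E"}, TRANSACTIONS))
print(confidence({"B"}, {"E"}, TRANSACTIONS))
print(is_strong({"B"}, {"E"}, TRANSACTIONS, min_support=0.4, min_confidence=0.7))
```

With this sample data, {B, E} appears in 3 of 5 transactions, and every transaction containing B also contains E, so B => E passes both of the thresholds chosen here.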
IV. Implementation principle
The algorithm is divided into two stages: calculating the support of the itemsets in each layer, and then calculating confidence from those support values. An example illustrates this directly. There are five records in the initial set. From the product combinations in the records, we can calculate the support of each layer step by step, as shown in the following figure:
[Figure: Support calculation process]
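Since the figure itself is not reproduced here, the layer-by-layer support calculation can be sketched in Python as follows. The five transactions and the minimum support count of 2 are assumptions chosen for illustration and are not necessarily the values used in the figure. The function name `apriori` is also hypothetical.

```python
from itertools import combinations

# Hypothetical transaction data (illustrative only).
TRANSACTIONS = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
    {"A", "C"},
]

def apriori(transactions, min_count):
    """Return [L1, L2, ...]; each Lk maps a frequent k-itemset to its count."""
    # Start with every single item as a candidate 1-itemset.
    current = {frozenset([item]) for t in transactions for item in t}
    levels = []
    k = 1
    while current:
        # Count how many transactions contain each size-k candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: n for c, n in counts.items() if n >= min_count}
        if not frequent:
            break
        levels.append(frequent)
        # Build size-(k+1) candidates, pruning any with an infrequent subset.
        k += 1
        prev = set(frequent)
        current = {
            a | b
            for a in prev
            for b in prev
            if len(a | b) == k
            and all(frozenset(s) in prev for s in combinations(a | b, k - 1))
        }
    return levels

levels = apriori(TRANSACTIONS, min_count=2)
for depth, level in enumerate(levels, start=1):
    print(f"L{depth}:", {"".join(sorted(s)): n for s, n in level.items()})
```

On this sample data the loop stops after three layers: item D is dropped at L1, the pairs {A, B} and {A, E} are dropped at L2, and {B, C, E} is the only frequent 3-itemset.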
As you can see, we finally get three layers of frequent itemsets with their support values: L1, L2, and L3. Next, we can calculate the confidence for each layer directly from the support. Here we take L3 as an example:
[Figure: Confidence calculation process]
Calculating confidence is relatively simple: for each itemset in layer K, compute the confidence of the rule from K-1 of its elements to the remaining element by applying the formula above directly. Here we can draw the rule that when BC appears, E must appear, and when CE appears, B must appear. Of course, this is just a simple example. In practice, there must be enough samples for the results to be reliable.
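As a rough sketch of this step, the Python below derives the rules "K-1 elements => remaining element" from one frequent itemset. The transactions are made up for illustration (they are not the article's five records), and `rules_from_itemset` is a hypothetical helper name.

```python
from itertools import combinations

# Hypothetical transaction data (illustrative only).
TRANSACTIONS = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
    {"A", "C"},
]

def rules_from_itemset(itemset, transactions, min_confidence):
    """List rules (antecedent, consequent, confidence) for one frequent itemset.

    Each antecedent is a (K-1)-subset of the K-itemset and the consequent is
    the single remaining item.
    """
    itemset = frozenset(itemset)
    full_count = sum(1 for t in transactions if itemset <= t)
    rules = []
    for subset in combinations(sorted(itemset), len(itemset) - 1):
        antecedent = frozenset(subset)
        consequent = itemset - antecedent
        ante_count = sum(1 for t in transactions if antecedent <= t)
        # confidence = count(whole itemset) / count(antecedent)
        conf = full_count / ante_count if ante_count else 0.0
        if conf >= min_confidence:
            rules.append((sorted(antecedent), sorted(consequent), conf))
    return rules

for antecedent, consequent, conf in rules_from_itemset("BCE", TRANSACTIONS, 1.0):
    print("".join(antecedent), "=>", "".join(consequent), f"confidence={conf:.2f}")
```

With this sample data and a minimum confidence of 1.0, only BC => E and CE => B are kept. BE => C has a confidence of 2/3 and is filtered out.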