How to Use Python to Implement Intelligent Recommendation

2025-07-11 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains in detail how to use Python to implement intelligent recommendation. I hope it gives you a working understanding of the topic.

Intelligent recommendation is oriented toward customer demand and brings value to customers. Common examples include Taobao's "You may also like" and Amazon's "Customers who bought this item also bought". Today let's look at how to implement an intelligent recommendation algorithm in Python.

Research direction: Python

Common recommendation systems and algorithms

Common **recommendation system categories** are:

By application area: e-commerce, social friend recommendation, etc.

By design approach: recommendation based on collaborative filtering, etc.

By data used: recommendation based on user tags, etc.

"Jingteng" cooperates to build user portrait label map

Common **recommendation algorithms** are:

This article focuses on shopping basket recommendation based on association rules, which is the easiest approach to understand and a classic one. Analyzing how strongly goods are associated helps increase product turnover, tap consumers' purchasing power and maximize sales. The modeling idea is that the pattern in which items are purchased together reflects the customer's demand pattern. **Applicable scenarios**: scenarios that do not require personalization; recommending products with sales records to existing customers; bundle design and product placement.

A brief introduction to the shopping basket

Q: What is a shopping basket, and in which scenarios is it mainly used?

A: The set of goods purchased by a single customer in one transaction is called a shopping basket, that is, the customer's receipt for that purchase. Common uses: supermarket shelf layout (complementary and mutually exclusive goods) and bundle design.

Q: What are the common algorithms for shopping basket analysis?

A: The common algorithms are:

Ignoring purchase order: association rules. Shopping basket analysis is often described as a kind of causal analysis, although strictly speaking association rules capture co-occurrence rather than causation. They are a convenient way to find the relationship between two goods. A complementary relationship means the two goods are positively correlated and can be sold together as complements, such as bean paste and spring onions. A substitute relationship means that if I buy one, I don't need to buy the other.

Considering purchase order: the sequential model. It is often used in e-commerce: for example, you add one item to the shopping cart today and another a few days later, which forms a sequence. Many brick-and-mortar stores, however, cannot record the order of users' purchases because they have no real-name authentication.

Q: How does finding complementary and mutually exclusive goods help with layout?

A: After computing the associations between goods with association rules, we may find three kinds of relationship among goods: strong association, weak association and exclusion. Each kind calls for its own layout.

Strong association: the threshold for a strong association must be set according to the actual situation and differs across industries and retail formats. Displaying strongly associated goods next to each other increases sales of both. Goods with a two-way association should be displayed together where the display location permits, that is, B is placed next to product A and A next to product B, as with shaving cream and razors, or men's hair oil and styling combs. For goods with a one-way association, only the associated item needs to be displayed next to the triggering item: put paper cups next to large bottles of cola, but not large bottles of cola next to paper cups. After all, consumers who buy large colas are likely to need paper cups, while customers who buy paper cups are less likely to buy large colas.

Weak association: try displaying goods with a low degree of association together, then check whether the degree of association changes. If it increases substantially, the original weak association may have been caused by the display.

Exclusion: the two products almost never appear on the same shopping receipt. Such goods should not be displayed together where possible.

Analyzing the association between goods from shopping basket data involves more than the three relationships above; they reflect only one aspect of the analysis (confidence). A comprehensive and systematic analysis needs three measures: **support**, **confidence** and **lift** (also called the degree of improvement).

Association rules

The three measures are hard to grasp directly from their formal definitions, especially the question of which item is conditioned on which in confidence and lift. In fact, we can look at them another way:

Support of rule X = number of transactions containing rule X / total number of transactions. Interpretation: support indicates how common rule X is.

Confidence of rule X (A → B) = number of transactions containing rule X / number of transactions containing item A of rule X. Interpretation: confidence is a conditional probability, the probability that a customer who has purchased product A will also buy product B.

To make these definitions easier to understand, let's practice with the following five shopping baskets.

It is easy to see that the denominator of support is 5, the number of shopping baskets, and the numerator is the number of baskets in which all items of the rule appear together. Take A → D as an example: two baskets contain both A and D, and the total number of transactions (baskets) is 5, so the support of rule A → D is 2/5. Three baskets contain item A, and 2 of those three also contain item D, so the confidence of the rule is 2/3. Two more questions about association rules are worth adding.
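The arithmetic for rule A → D can be checked with a few lines of Python. This is only a sketch: the five baskets below are hypothetical (the original example table is not reproduced here) and were chosen so that A appears in three baskets and A and D appear together in two, matching the counts in the text. The helper names `support` and `confidence` are illustrative.

```python
from fractions import Fraction

# Hypothetical baskets, chosen to match the counts quoted in the text:
# A appears in 3 baskets, A and D appear together in 2, and there are 5 in total.
baskets = [
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"A", "B", "C"},
    {"B", "C"},
    {"B", "D"},
]

def support(items, baskets):
    """Share of all baskets that contain every item in `items`."""
    hits = sum(1 for basket in baskets if set(items) <= basket)
    return Fraction(hits, len(baskets))

def confidence(lhs, rhs, baskets):
    """Among baskets containing `lhs`, the share that also contain `rhs`."""
    return support(set(lhs) | set(rhs), baskets) / support(lhs, baskets)

print(support({"A", "D"}, baskets))       # 2/5
print(confidence({"A"}, {"D"}, baskets))  # 2/3
```

Exact fractions are used so the output reads the same way as the hand calculation.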

Q: Is it enough to look only at support and confidence?

A: Consider a case: a canteen sells meals. Among 1000 meal purchase records, 800 include rice, 600 include beef, and 400 include both. For the rule (beef → rice): **Support** = P(beef & rice) = 400/1000 = 0.40. **Confidence** = P(rice | beef) = 400/600 ≈ 0.67. Both **confidence** and **support** are high, but is it meaningful to recommend rice to people who buy beef? Clearly not, because the unconditional probability that a user buys rice, P(rice) = 800/1000 = 0.8, is already greater than the 0.67 probability of buying rice after buying beef. After all, rice is more popular than beef to begin with.

This case leads to the concept of **lift** (degree of improvement): lift = confidence / unconditional probability = (400/600) / (800/1000) ≈ 0.83. If rule X (A → B) has a lift of n, then when B is recommended to a customer who buys A, the probability that the customer buys B is about n × 100% of the probability that they would buy B naturally. An everyday illustration: consumers seldom buy corner anti-collision sponges on their own, and may think of them only occasionally or after their child bumps into a table corner. If we recommend corner anti-collision sponges on the order-success page for a table, sales of the sponges can increase considerably. This also fits the aim of using best-selling goods to drive sales of slower-selling ones.
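As a quick check of the canteen example, the three measures can be computed directly from the four counts given in the text. The variable names are illustrative only.

```python
# Counts quoted in the canteen example.
total = 1000  # meal purchase records
rice = 800    # records that include rice
beef = 600    # records that include beef
both = 400    # records that include both

support_rule = both / total        # P(beef & rice)
confidence_rule = both / beef      # P(rice | beef)
p_rice = rice / total              # unconditional P(rice)
lift = confidence_rule / p_rice    # degree of improvement (lift)

print(round(support_rule, 2), round(confidence_rule, 2), round(lift, 2))
# 0.4 0.67 0.83
```

A lift below 1 confirms the point of the example: recommending rice to beef buyers does no better than their natural purchase rate.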

Q: Beyond the formulas, how else can the three measures (support, confidence, lift) be understood?

A: They can be understood like this:

Support indicates whether this group of associated goods accounts for a large enough share of transactions.

Confidence represents the strength of the association.

Lift shows whether the association rule is valuable and worth applying: how much better it is to use the rule (recommend after a purchase) than not to use it (let customers buy naturally).

Therefore, 1.0 is the cut-off value for lift. It is then easy to see why, in the canteen case, recommending rice to users who bought beef has a lift below 1. In addition, if two items have high confidence (say 100%, meaning they always appear together) but low support (a small share of transactions), the rule won't help overall sales much.

Hands-on Python practice with the Apriori algorithm

Because algorithms such as Apriori are well studied, in Python we can call existing functions directly instead of computing everything step by step. The main goal is to **understand the underlying principle and compare the strengths and weaknesses of different algorithms**.

Exploratory analysis

First, import the relevant libraries and perform an exploratory analysis of the data.

Description of the data fields

OrderNumber: customer nickname

LineNumber: purchase sequence. For example, the first three rows give the order in which the same customer purchased three items.

Model: product name

Next, let's take a look at the types of goods.

Let's take a look at the top 15 best-selling products.

And do some simple visualization.
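The exploratory steps above might look roughly like the following pandas sketch. The DataFrame `orders` and its rows are hypothetical stand-ins for the real dataset; only the column names OrderNumber, LineNumber and Model come from the field description.

```python
import pandas as pd

# Hypothetical rows using the three columns described above.
orders = pd.DataFrame({
    "OrderNumber": ["c1", "c1", "c1", "c2", "c2", "c3"],
    "LineNumber": [1, 2, 3, 1, 2, 1],
    "Model": [
        "mountain bike inner tube", "ll mountain tire", "sports helmet",
        "mountain bike inner tube", "ll mountain tire",
        "mountain bike inner tube",
    ],
})

# How many distinct products are there?
n_models = orders["Model"].nunique()

# The 15 best-selling products (fewer if the data has fewer products).
top_models = orders["Model"].value_counts().head(15)

print(n_models)
print(top_models)

# A simple bar chart is one call away if matplotlib is installed:
# top_models.plot(kind="bar")
```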

Using the Apriori algorithm to mine association rules

First, **generate the shopping baskets**: put all the items purchased by the same customer into the same basket. The Apriori package must be installed in advance with pip install Apriori. We then use the dataconvert function in the Apriori package. Its parameters are explained below.

arulesdata: the dataset (DataFrame).

tidvar: the "grouping index", that is, the key used to divide shopping baskets. In this case baskets are grouped by customer, OrderNumber (object type).

itemvar: what goes into the basket. In this case it is the goods in the dataset, that is, the Model column (object type).

data_type: defaults to 'inverted', the value provided by the library; it does not need to be changed.

Note: pay attention to the types of the arguments passed in. As long as they are right, the function is straightforward to apply.

Now check the items in the first five shopping baskets
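To make the basket-building step concrete, here is a minimal sketch in plain pandas. It is a stand-in for illustration, not the dataconvert function of the Apriori package, and the `orders` rows are hypothetical.

```python
import pandas as pd

# Hypothetical rows with the OrderNumber and Model columns described earlier.
orders = pd.DataFrame({
    "OrderNumber": ["c1", "c1", "c1", "c2", "c2", "c3"],
    "Model": [
        "mountain bike inner tube", "ll mountain tire", "sports helmet",
        "mountain bike inner tube", "ll mountain tire",
        "mountain bike inner tube",
    ],
})

# One basket per customer: collect the distinct Model values for each OrderNumber.
baskets = orders.groupby("OrderNumber")["Model"].apply(lambda s: sorted(set(s)))

# Inspect the first five baskets.
print(baskets.head(5))
```

Grouping by OrderNumber plays the role of tidvar, and the Model column plays the role of itemvar.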

Now **generate the association rules**. By simple combinatorics, these transactions could produce as many as 21255 × 21254 / 2 candidate association rules. A rule must first meet the support requirement; rules whose support is too small can be dropped directly. The support threshold can be tuned according to the number of rules produced: if there are too few rules, the support requirement can be relaxed as appropriate. The relevant parameters are:

minSupport: minimum support threshold

minConf: minimum confidence threshold

minlen: minimum rule length

maxlen: maximum rule length; 2 is usually enough

The lower minSupport or minConf is set, the more rules are generated and the greater the amount of computation.
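The effect of the two thresholds can be seen in a small brute-force sketch that counts two-item rules directly. It is not the Apriori package's implementation and skips Apriori's level-wise pruning; the function `pair_rules` and the five baskets are hypothetical, and the parameter names `min_support` and `min_conf` simply mirror minSupport and minConf.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support, min_conf):
    """Return all rules lhs -> rhs of length 2 that meet both thresholds."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in set(basket))
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(set(basket)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # too rare: drop the pair before looking at confidence
        for lhs, rhs in ((a, b), (b, a)):
            conf = count / item_counts[lhs]
            if conf >= min_conf:
                lift = conf / (item_counts[rhs] / n)
                rules.append({"lhs": lhs, "rhs": rhs, "support": support,
                              "confidence": conf, "lift": lift})
    return rules

# Hypothetical baskets for illustration.
baskets = [
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"A", "B", "C"},
    {"B", "C"},
    {"B", "D"},
]

strict = pair_rules(baskets, min_support=0.3, min_conf=0.6)
loose = pair_rules(baskets, min_support=0.1, min_conf=0.1)
print(len(strict), len(loose))  # 7 12
```

Loosening both thresholds raises the rule count from 7 to 12 on this toy data, which is the trade-off described above.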

Taking the first row of result as an example:

lhs: the left-hand side of the rule, that is, the item the user has purchased: mountain bike inner tube.

rhs: the right-hand side of the rule, that is, the item recommended given what the user purchased: ll mountain tire.

support: the probability that the mountain bike inner tube and the ll mountain tire appear on the same shopping receipt.

confidence: the probability of buying the ll mountain tire given that the mountain bike inner tube was purchased.

lift: if you recommend the ll mountain tire to customers who buy the mountain bike inner tube, the probability that the customer buys the ll mountain tire is about 400% of the probability that the customer would buy it naturally, that is, about 300% higher!

Now we **screen for complementary and mutually exclusive products**. The code is as follows:

A brief analysis of the results: do not expect every rule to be meaningful; interpret them with business knowledge. For example, it is normal for racing bicycles and sports water bottles to be mutually exclusive: racing emphasizes light weight, so why carry a water bottle? Another example is a mountain bike paired with a sports helmet meant for road racing bikes. Note that mutually exclusive products appear in pairs!
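One plausible way to do this screening, assuming the rules are held in a pandas DataFrame with the lhs, rhs, support, confidence and lift fields, is to split on the 1.0 cut-off for lift. The `result` table below is hypothetical and every number in it is made up for illustration.

```python
import pandas as pd

# Hypothetical rule table; the column names follow the fields of the rule
# results, but every number here is made up for illustration.
result = pd.DataFrame({
    "lhs": ["mountain bike inner tube", "ll mountain tire",
            "racing bike", "sports water bottle"],
    "rhs": ["ll mountain tire", "mountain bike inner tube",
            "sports water bottle", "racing bike"],
    "support": [0.06, 0.06, 0.015, 0.015],
    "confidence": [0.60, 0.40, 0.075, 0.06],
    "lift": [4.0, 4.0, 0.3, 0.3],
})

# Lift above 1: buying lhs makes rhs more likely (complementary products).
complements = result[result["lift"] > 1].sort_values("lift", ascending=False)

# Lift below 1: buying lhs makes rhs less likely (mutually exclusive products).
exclusions = result[result["lift"] < 1].sort_values("lift")

print(complements[["lhs", "rhs", "lift"]])
print(exclusions[["lhs", "rhs", "lift"]])
```

Because the lift of A → B equals the lift of B → A, each exclusion shows up twice, once in each direction.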

Recommending products based on the association rule results

This needs to be combined with business needs:

To get the highest marketing response rate? Look at confidence: the higher the better.

To maximize sales? Look at lift: the higher the better.

The user has not purchased anything yet, and we want to recommend a product?

Aiming for the highest marketing response rate

Suppose a new customer has just placed an order for the product *Mountain Bike Yingqi*. If you want to **get the highest marketing response rate**, which product should you recommend on the payment-success page?

Goal: to achieve the highest marketing response rate
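A minimal sketch of this selection, assuming the rules are available as a list of records: keep the rules whose lhs is the purchased product and pick the one with the highest confidence. The function `best_by_confidence`, the "product X/Y/Z" names and all numbers are hypothetical placeholders, not results from the dataset.

```python
# Hypothetical rules: "product X/Y/Z" and all numbers are placeholders.
rules = [
    {"lhs": "Mountain Bike Yingqi", "rhs": "product X", "confidence": 0.40, "lift": 1.5},
    {"lhs": "Mountain Bike Yingqi", "rhs": "product Y", "confidence": 0.25, "lift": 3.0},
    {"lhs": "some other product", "rhs": "product Z", "confidence": 0.90, "lift": 5.0},
]

def best_by_confidence(rules, purchased):
    """Return the rhs of the highest-confidence rule whose lhs is `purchased`."""
    candidates = [rule for rule in rules if rule["lhs"] == purchased]
    if not candidates:
        return None  # no rule covers this product
    return max(candidates, key=lambda rule: rule["confidence"])["rhs"]

print(best_by_confidence(rules, "Mountain Bike Yingqi"))  # product X
```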

Aim to maximize overall sales

Suppose a new customer has just placed an order for the product *Mountain Yingqi*. If you want to **maximize overall sales**, which products should be recommended on the payment-success page?

Goal: to maximize sales
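For the sales-maximizing goal, a comparable sketch ranks the candidate rules by lift (degree of improvement) instead of confidence and keeps only those above the 1.0 cut-off. As before, the function `top_by_lift`, the "product X/Y/W" names and all numbers are hypothetical placeholders.

```python
# Hypothetical rules: "product X/Y/W" and all numbers are placeholders.
rules = [
    {"lhs": "Mountain Yingqi", "rhs": "product X", "confidence": 0.40, "lift": 1.5},
    {"lhs": "Mountain Yingqi", "rhs": "product Y", "confidence": 0.25, "lift": 3.0},
    {"lhs": "Mountain Yingqi", "rhs": "product W", "confidence": 0.30, "lift": 0.8},
]

def top_by_lift(rules, purchased, n=3):
    """Return up to n rhs products for `purchased`, highest lift first."""
    candidates = [
        rule for rule in rules
        if rule["lhs"] == purchased and rule["lift"] > 1  # skip rules that do not help
    ]
    ranked = sorted(candidates, key=lambda rule: rule["lift"], reverse=True)
    return [rule["rhs"] for rule in ranked[:n]]

print(top_by_lift(rules, "Mountain Yingqi"))  # ['product Y', 'product X']
```

Note that the ranking differs from a confidence-based one: product Y has lower confidence than product X but the higher lift.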

Once again, the plain meaning of lift: lift is measured relative to natural purchase. A lift of 4.0 for the rule A → B means that if B is recommended to a user who buys A, the probability that the user buys B is 400% of the probability that the user would buy B on their own (that is, naturally). In other words, it is 300% higher than the natural purchase probability.

The user has not purchased anything yet: recommending a product to them

Conclusion: the Apriori algorithm based on association rules is one of the classic applications in intelligent recommendation, and it is simple and easy to use.

That concludes this introduction to implementing intelligent recommendation with Python. I hope the content above is helpful.
