The content of this article comes from the Shenze Data closed-door sharing session "Intelligent Recommendation: Analysis of Application Scenarios and Technical Difficulties", presented by Hu Shiwen, an algorithm expert at Shenze Data, under the theme "Practice and Thinking of Recommendation Systems".
Hello, everyone. Before the talk began, I conducted a small survey of the audience and learned that the problems you commonly encounter in work related to recommendation systems include: "the data is too sparse", "the data does not form a closed loop", "the data cannot be combined with other systems", and so on. These are the practical problems in front of us. So when we actually start to build a recommendation system, how many aspects do we need to consider?
First, the algorithm. What kind of algorithm should be chosen? Whether it is collaborative filtering or something else, the choice should be based on your own business and product.
Second, data. Once the algorithm is determined, what kind of data should be selected? How should the data be processed? What method should be used to collect it? There is a saying that "machine learning = model + data". Even a very complex model will not work well in a recommendation system if there is something wrong with the data.
Third, online services. When the model is trained and the data is fully prepared, the system must receive user requests and return recommendation results, which raises two questions. First, the response must be fast enough: if the recommendation result comes back a full second after the user's request, the user is likely to lose patience. Second, how do we make the recommendation system highly scalable? When DAU rises from 100,000 to 1 or 2 million, can the recommendation system handle the much larger volume of requests as well as it did at first? These are the problems that online services must face.
Fourth, evaluating the effect. Doing the above three points well does not mean that everything will be fine. On the one hand, we should keep iterating the model and structure of the recommendation algorithm; on the other hand, we should build a relatively complete and systematic evaluation framework and methods to analyze the current state and future development of the recommendation results.
Below I will share some of the problems we have encountered in practice, and the solutions we summed up, along these four aspects.
I. Algorithm
Among all kinds of algorithms, the easiest one to think of is a tag-based method.
As shown in the figure above, tags can be divided into two types.
The first is user tags. Suppose we have some user tags and know the age, gender, and other information of each user. When users of a certain age and gender like an item, we can recommend that item to other users with the same user tags.
The second is content tags. Similar to the idea of user tags: if a user likes an item with a certain content tag, we can recommend other content with the same tag.
But this tag-based approach clearly has an important drawback: it requires a sufficient number of tags. In many products there may be no tags at all, or the tags may be very sparse, so a tag-based method alone is obviously not enough.
In addition, collaborative filtering is a classic method mentioned by many people, and it is a common and effective approach.
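As an illustration only (not the exact method described in the talk), here is a minimal item-based collaborative filtering sketch in Python: it builds a user-item interaction matrix, computes item-item cosine similarity, and recommends items similar to what a user has already interacted with.

```python
# Minimal item-based collaborative filtering sketch (illustrative only).
import numpy as np

# Rows = users, columns = items; 1 means the user interacted with the item.
interactions = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
], dtype=float)

# Item-item cosine similarity.
norms = np.linalg.norm(interactions, axis=0, keepdims=True)
norms[norms == 0] = 1.0                      # avoid division by zero
normalized = interactions / norms
item_sim = normalized.T @ normalized         # shape: (n_items, n_items)

def recommend(user_idx, top_k=2):
    """Score items by similarity to the user's history, mask already-seen items."""
    scores = interactions[user_idx] @ item_sim
    scores[interactions[user_idx] > 0] = -np.inf   # do not re-recommend seen items
    return np.argsort(-scores)[:top_k]

print(recommend(0))   # item indices most similar to what user 0 liked
```

A real system would work with far sparser matrices and implicit-feedback weighting, but the matching idea is the same.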
With the continuous development of technology, deep learning has been discussed and studied intensively by almost the whole machine learning community since 2012. In 2016, Google proposed a deep-learning-based recommendation model, which uses deep learning and users' behavior data to build the recommendation algorithm.
1. One of the purposes of deep learning: vectorization
A recommendation system is essentially doing "matching": matching people with things. The recommendation system, which seems very difficult, actually has a simple core idea: match people with things, and recommend items the user may be interested in. If we think about this problem from a mathematical point of view, how do we calculate the similarity between a person and a thing?
In the field of recommendation, one of the purposes of deep learning is to vectorize people and items, that is, to learn a unified representation for a person and an item, and then calculate the similarity between the person and the item in that unified representation. Once people and items are mapped into the same comparable space, relevant content can be recommended based on the computed similarity.
When the final result is mapped onto a two-dimensional plane like the chart above, content that users consider similar ends up close together in the vector space; once the content vectors exist, users can be mapped into the same space. For example, when a user lands at a certain point in the chart, we can push education, entertainment, science, geography, and other content to him based on his position.
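The following sketch is an assumption for illustration, not the model from the talk: once a user and candidate items live in the same embedding space, recommendation reduces to a nearest-neighbor lookup by cosine similarity.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical learned embeddings in a shared space (values are made up).
user_vec = np.array([0.9, 0.1, 0.3])
item_vecs = {
    "education":     np.array([0.8, 0.2, 0.4]),
    "entertainment": np.array([0.1, 0.9, 0.2]),
    "science":       np.array([0.7, 0.1, 0.6]),
}

# Rank items by similarity to the user and recommend the closest ones.
ranked = sorted(item_vecs, key=lambda k: cosine(user_vec, item_vecs[k]), reverse=True)
print(ranked)
```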
At this point, some friends will ask: since deep learning is so complicated, is it actually useful in practice? From practical experience, when you have a certain amount of data, it does bring an obvious improvement. But when you set out to build a deep learning model, you may indeed run into many problems, for example:
How much data is enough to train the model? What format should the training data be in? How much "depth" counts as a deep model? What should you do if training the model is too slow? The difficulties encountered in building a deep learning model, and their solutions, will be shared later.
2. Cold start
Cold start is a problem often encountered in the algorithm part. In the cold start phase, the data is relatively sparse, so it is difficult to use the user's behavior data to achieve personalized recommendation. There are two kinds of cold start problems: the cold start of new content and the cold start of new users. Next, let's look at how to implement the cold start of new content.
For example, the requirement in an information-feed scenario is to distribute new content (such as content published within the last 10 minutes) into users' recommendation results in a real-time and personalized manner.
The article above was published at 17:41, so we need to make personalized recommendations based on its content within a very short time. The article is about food, so after a user clicks on it, food-related content should appear in its related recommendations. When we want to achieve recommendation results under such real-time constraints, there is no way to rely on user behavior.
At this time, we offer one way of thinking: a semantic understanding model based on deep learning.
There is a big difference between this model and the ones shared earlier: it needs no user behavior, only an analysis of the article text, generating a vector for each article based on its content. It also has something in common with the models above: first, it uses deep learning to solve the problem, and second, it uses vectorization. We only need to train a semantic vector for each article and compute the similarity between articles to know how relevant an article is to a user.
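The talk describes a deep semantic model; as a much simpler stand-in that still shows the behavior-free, text-only idea, the sketch below vectorizes articles with TF-IDF and finds the most similar existing article by cosine similarity (scikit-learn is assumed to be available).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A brand-new article (no clicks yet) and a small existing corpus.
new_article = "A guide to the best street food and local restaurants"
corpus = [
    "Top ten restaurants and food markets to visit this weekend",
    "Latest smartphone reviews and electronics news",
    "Football league results and match highlights",
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(corpus + [new_article])

# Similarity of the new article (last row) to every existing article.
sims = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
best = sims.argmax()
print(corpus[best], sims[best])   # the food article should score highest
```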
3. Recall, sorting, rules
Today's recommendation systems have become very complex, especially in large-scale application scenarios. Jinri Toutiao's feed stream and Taobao's "guess you like", for instance, both have very complex recommendation systems, and each module may involve many experimental algorithms; it is very common for 10 or 20 models to appear in one system. So how can these models be effectively integrated into a real system?
Recall
Recall means selecting, from a large amount of content, the content each user might be interested in; it only matters when there is a large amount of content, because when content is scarce there is no need to build a complex recommendation system. Therefore, when there are a large number of items, we need recall algorithms to generate candidates the user may be interested in from different categories of content. For example, if a user likes both sports and military content, then in the first step, no matter which model is used, we hope to generate some sports- and military-related content for that user. Another user may like food and games, so in the recall phase we hope the models generate some food- and game-related content for him.
There may be many models in the recall phase. After recall, although content the user may be interested in has been generated, it is not yet integrated and is still in a disordered state.
Sort
Sorting means ranking the recalled content in a unified way. The sorting process is essentially scoring each piece of content, predicting each user's interest in it, so that we know each user's preference for each piece of content.
Rules
Recommendation systems are often closely tied to products or business scenarios, and there will always be requirements in the product that models cannot solve. Because a model can only discover the relationship between users and items from user behavior or text content, some common business requirements are implemented through rules. For example, editorial picks appear in some recommendation scenarios, and the operations colleagues' requirement is to guarantee one editorial pick in every ten pieces of content; such a requirement can only be realized through rules, not algorithms.
A more complex recommendation system is usually divided into three steps: recall, sorting, and rules. First, recall content the user is interested in; second, generate a sorted list for the user; and third, use rules to satisfy product and operational requirements. A minimal sketch of such a pipeline follows.
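The sketch below is only an illustrative skeleton of the recall → sort → rules flow described above; the candidate sources, the placeholder scoring function, and the rule that inserts an editorial pick at a fixed interval are stand-ins, not the speaker's implementation.

```python
import random

def recall(user_id):
    """Merge candidates from several recall sources (stand-in data)."""
    sports = ["match_recap", "transfer_news"]
    food = ["noodle_guide", "street_food"]
    return list(dict.fromkeys(sports + food))   # de-duplicate, keep order

def sort_candidates(user_id, candidates):
    """Score each candidate; a real system would call a trained ranking model."""
    score = lambda item: random.random()         # placeholder for a model score
    return sorted(candidates, key=score, reverse=True)

def apply_rules(ranked, editorial_picks, every=10):
    """Business rule: one editorial pick appears in every `every` items."""
    result, picks = [], list(editorial_picks)
    for i, item in enumerate(ranked, start=1):
        result.append(item)
        if i % (every - 1) == 0 and picks:
            result.append(picks.pop(0))
    return result

candidates = recall(user_id=42)
ranked = sort_candidates(42, candidates)
# `every=3` only so the rule is visible with this tiny candidate list.
print(apply_rules(ranked, editorial_picks=["chef_interview"], every=3))
```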
II. Data
There is a saying that "the effect of the recommendation algorithm is determined by the model and the data", that is, the model accounts for only part of the recommendation effect, and the other very important part is the data. So what kind of data do we need? What data is likely to play a role in a practical recommendation system? And what data can we actually get?
Generally speaking, there are four types of data: user behavior, item information, user profile and external data.
1. User behavior
User behavior data is the most important; almost no recommendation system can do without it. On the one hand, user behavior data is an important data source for training the model; on the other hand, technical colleagues learn how the recommendation system is doing through user behavior feedback. One of the secrets of building a recommendation system is to accumulate user behavior data. If important user behaviors are not collected, for example, if an e-commerce scenario records only the final order data, then the data still falls short of the requirements of a recommendation system.
2. Item information
Item information refers to the information that can be collected in the recommendation system to describe each piece of content. Take the e-commerce scenario as an example: when an item is entered into the system, the brand, price, category, shelf time, and so on are the item information we want to collect. If you do not know the brand of each product and cannot extract which brand a product belongs to from its description, then the recommendation effect is naturally limited. When the item information collected is rich enough, it helps the effect of the recommendation system.
3. User profile
The traditional way of thinking holds that what is stored in a user profile is the user's tags, but in many actual scenarios the number of tags is small and the dimensions are coarse, so they may not be able to characterize users at all. This traditional "tagging" mindset limits the thinking around building a recommendation system.
From the perspective of deep learning, what is stored in a user profile is not the commonly understood "tag"; it may store a vector for that person. Deep learning vectorizes people and items, but the vector is not human-interpretable. That is, when we see a user's vector, we may not know what it means; we do not know whether the user is interested in sports, music, or entertainment, yet we can still recommend what he is interested in through the vectors.
4. External data
Some people blindly believe in external data: they feel they do not have enough data, so they must buy external data from Ali or Tencent to enrich their user profiles and thereby improve the recommendation system. Some even think that the recommendation system does not work well precisely because there is no external data.
But in fact, the effect of external data on a recommendation system, in my view, needs very careful reasoning and verification.
First, verify how much your user base overlaps with the external data you buy. If a game platform buys Ali's external data, and that data can only tell you whether users like to buy clothes, cars, or electronic products, is such information useful to the game platform?
Assuming the purchased external data happens to fit the business scenario, it may play a role; but in practice, hitting both the right user group and the right tags is not common.
Do not assume that because the above four kinds of data are easy to understand, they are also easy to obtain. In fact, when my Shenze colleagues and I go to build an actual recommendation system, what consumes our manpower is often not the algorithm, but getting the data right. Next, taking user behavior data as an example, let me share how to get the user behavior data we need.
At this point we have to ask: when we set out to obtain user behavior data, what role do we want it to play for us?
I would like to sum up the following aspects:
First, we hope user behavior data can be used to train the model, which is a very important aspect. For example, if I recommend ten items to a user and two of them are clicked, the model will treat those two as positive examples and the others as negative examples. Therefore, we need user behavior data as the model's training data.
Second, we hope user behavior data can verify the effect. After the recommendation system goes online, it needs user behavior data to show how the recommendations are doing. For example, an increase in click-through rate shows that the effect is better; a decrease in click-through rate, more negative feedback, and user churn indicate that something may be wrong with the recommendation system.
Third, we hope user behavior data supports us in measuring A/B test results. A model launch must be based on an A/B test: we need to know how effective the new launch is compared with the previous recommendation algorithm and system. In this way we can judge whether the iteration is valid; if it is, it is rolled out to full traffic, and if not, we iterate further.
Fourth, we hope it can help us analyze problems. After the recommendation system goes online, we may encounter some annoying problems: for example, the click-through rate does not change, or the effect even gets worse. After all, not every iteration can improve the effect. So we hope behavior data can locate why the recommendation system does not work well after launch; and if the effect is good after launch, we hope behavior data can show which factors made it better.
So how do we get behavior data that meets these needs? Take the first field in the exposure log, exp_id, as an example; exp_id stands for experiment ID.
It was mentioned earlier that we want user behavior data to support A/B testing, so how do we know which experiment group each piece of data comes from? We need an exp_id field to record which experiment group each exposure log entry comes from. When we later analyze A/B test results, we can use the exp_id field to distinguish the exposures and clicks produced by different experiments.
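A hypothetical exposure log entry might look like the following; the field names other than exp_id are illustrative assumptions, not the speaker's actual schema.

```python
import json, time

# One exposure record: which experiment produced this impression,
# which user saw which item, and when.
exposure_event = {
    "exp_id": "exp_20_ranker_v2",       # experiment group that served this result
    "user_id": "u_10086",
    "item_id": "article_42",
    "position": 3,                       # slot in the recommendation list
    "timestamp": int(time.time() * 1000),
}
print(json.dumps(exposure_event))        # one JSON line appended to the exposure log
```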
We often talk about how to design common fields in the exposure log; another concrete question is: how do we collect this data?
To put it simply: when a user performs some behavior in the product, how does that data eventually land in the server's logs for model training and effect analysis?
There are usually two ways to collect user behavior. The first is self-built tracking: the client first records the user's behavior, then passes it to the business server, which passes it on to the recommendation engine. The other is SDK tracking: we use an SDK to send events directly to the recommendation engine.
SDK tracking has two advantages:
First, the access cost of SDK tracking is low, and it comes with relatively mature tracking events and verification schemes. In addition, the SDK provides a tracking interface and documentation to guide customers through instrumentation, so there is no need to worry about the reporting pipeline.
Second, SDK tracking has better fault tolerance. With self-built tracking, data travels from the client through the server to the recommendation engine; when data problems occur, it is hard to tell whether they are tracking problems or transmission problems, and the cost of maintaining data quality is high. SDK tracking is comparatively convenient.
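As a purely hypothetical illustration of client-side tracking (not any real vendor's SDK), the sketch below records one behavior event and posts it to a collection endpoint; the endpoint URL and payload shape are assumptions.

```python
import json
import urllib.request

COLLECT_URL = "https://example.com/collect"   # hypothetical collection endpoint

def track(event_name, properties):
    """Send one behavior event to the collection server (fire-and-forget sketch)."""
    payload = json.dumps({"event": event_name, "properties": properties}).encode("utf-8")
    req = urllib.request.Request(
        COLLECT_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    try:
        urllib.request.urlopen(req, timeout=2)
    except OSError:
        pass   # a real SDK would buffer and retry failed reports

track("item_click", {"user_id": "u_10086", "item_id": "article_42", "exp_id": "exp_20_ranker_v2"})
```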
So how do you train the model once you have behavior data? There are usually the following steps:
First, construct positive and negative examples. For example, when ten items are recommended to a user, the clicked ones are positive examples and the unclicked ones are negative examples.
Second, construct feature engineering. Later, we will take an e-commerce scenario as an example to explain how feature engineering is generally done.
Third, data sampling. Data sampling has a great influence on the training effect of the whole model. A small sketch of steps one and three follows.
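The following sketch (illustrative only, with made-up log data) builds labeled examples from exposure and click logs and then down-samples the negatives, since unclicked impressions usually far outnumber clicks.

```python
import random

exposures = [("u1", "i1"), ("u1", "i2"), ("u1", "i3"), ("u2", "i1"), ("u2", "i4")]
clicks = {("u1", "i2"), ("u2", "i4")}

# Step 1: label each exposure — clicked impressions are positives, the rest negatives.
positives = [pair for pair in exposures if pair in clicks]
negatives = [pair for pair in exposures if pair not in clicks]

# Step 3: down-sample negatives to a fixed ratio so the classes stay balanced.
neg_per_pos = 1
sampled_negatives = random.sample(negatives, min(len(negatives), neg_per_pos * len(positives)))

training_set = [(u, i, 1) for u, i in positives] + [(u, i, 0) for u, i in sampled_negatives]
print(training_set)
```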
Taking the e-commerce scenario as an example, feature engineering is mainly divided into two aspects (a sketch follows the list):
First, the commodity dimension. For goods, we may pay attention to category, brand, price, and target gender, as well as data from user behavior feedback, such as click-through rate and collection ratio. These features reflect some attributes of the product itself, and also reflect the product's quality.
Second, the user dimension. The user's age and gender are usually considered first, because in e-commerce there is a big difference between the goods preferred by men and women. In addition, there are the user's category preference, brand preference, price preference, and so on.
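As a minimal illustration of combining the two dimensions (the feature names and encodings are assumptions, not the speaker's feature set), the sketch below concatenates a few item-side and user-side features into one training vector.

```python
def build_features(item, user):
    """Concatenate item-side and user-side features into one vector (toy encoding)."""
    item_features = [
        item["price"],
        item["click_rate"],
        1.0 if item["category"] == "clothing" else 0.0,    # simple category flag
    ]
    user_features = [
        user["age"],
        1.0 if user["gender"] == "female" else 0.0,
        user["category_pref"].get(item["category"], 0.0),  # user's affinity to this category
    ]
    return item_features + user_features

item = {"price": 199.0, "click_rate": 0.05, "category": "clothing"}
user = {"age": 28, "gender": "female", "category_pref": {"clothing": 0.8, "games": 0.1}}
print(build_features(item, user))
```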
On the data side, I would like to share a "pit" I ran into in actual work:
After going online with a small amount of traffic, my teammates and I found that the effect was not as good as expected. Based on past experience, it should not have been such a bad result. When we analyzed the data, we found two anomalies.
First, too few users hit the behavior model. In general, as long as a user is not new, he should in theory hit the behavior model. At the time, the proportion of new users was less than 20%, yet only about 30% of users hit the model, indicating that a large number of users were missing it.
Second, many request IDs did not appear in the logs. At the time we wondered whether our recommendations were being scraped by someone's cheating traffic, because "cheating" would explain well why these requests did not land in the logs.
But in the end we found that it was not a cheating problem: the user IDs were simply not unified. The front end was logging with one user ID system, while the back end was sending requests with another. So none of the data could be matched; every request from the back end looked like a new user, and the trained model could not hit any user.
Finally, we established a series of methods, tools, and processes to ensure the consistency of the entire user ID system.
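One simple way to enforce such consistency (a sketch under the assumption that a front-end-to-back-end ID mapping exists; this is not the speaker's actual tooling) is to resolve every incoming ID to a single canonical user ID before logging or serving.

```python
# Hypothetical mapping from front-end device/login IDs to the canonical back-end ID.
id_mapping = {
    "fe_device_abc": "u_10086",
    "fe_login_xyz":  "u_10086",
}

def canonical_user_id(raw_id):
    """Resolve any incoming ID to the canonical ID; flag unknown IDs for investigation."""
    if raw_id in id_mapping:
        return id_mapping[raw_id]
    # Unknown IDs are surfaced rather than silently treated as new users.
    print(f"WARNING: unmapped id {raw_id}")
    return raw_id

assert canonical_user_id("fe_device_abc") == canonical_user_id("fe_login_xyz")
```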
The above covers the first two parts, "algorithm" and "data", of Shenze Data algorithm expert Hu Shiwen's talk on the practice and thinking of recommendation systems. Due to space constraints, "online services" and "effect evaluation" will be introduced in the next article. I hope this is helpful to you!