Shenze data algorithm expert: practice and thinking of recommendation system (part two) 04/15 Update SLTechnology News&Howtos


2025-04-15 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

Online service of recommendation system

With the algorithm and data problems solved, we need to build an online service for the recommendation system that responds to users' recommendation requests. Suppose the company's DAU starts at 100,000. When DAU grows to 1 million, we hope to keep service response times under control simply by adding machines. Re-architecting the recommendation service every time DAU grows would be too expensive, so we want the service to be highly scalable.

Other common questions include: how do we query and compute over high-dimensional vectors? How do we meet different timeliness needs in different scenarios? How do we handle monitoring and alerting? And so on.

Deep learning models are complex and effective, but even after a model is trained, moving it into an online service is still a big challenge. Today I would like to share some of the practical issues.

1. How to query high-dimensional vectors?

For example, suppose there are currently 100,000 products, each with one vector, so there are 100,000 vectors. Each arriving user also corresponds to a vector. We then need to find, among those 100,000 vectors, the few hundred (for example, 500) that best match the user vector. The lookup also has to finish in a short enough time (10-20 milliseconds), so response time is a real challenge.

Our solution is to use a tool called Faiss, which handles large-scale vector similarity search and can support a content space of up to 1 billion items. In short, even with 1 billion items we can still use this component for vector-based similarity search.
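To make the query concrete, here is a minimal Python sketch of what "find the best-matching vectors" means: an exact, brute-force top-k search by inner product. This is not the Faiss API, and using inner product as the similarity measure is an assumption; the function name and the random toy data are made up for illustration.

```python
import heapq
import random


def top_k_by_inner_product(user_vec, item_vecs, k):
    """Return the ids of the k items whose vectors score highest against user_vec.

    item_vecs maps item id -> vector (list of floats). This is a plain linear
    scan over every item, i.e. the exact baseline that a vector search library
    is meant to speed up.
    """
    def score(item_id):
        return sum(u * v for u, v in zip(user_vec, item_vecs[item_id]))

    # Iterating a dict yields its keys, so nlargest ranks item ids by score.
    return heapq.nlargest(k, item_vecs, key=score)


# Hypothetical toy data: 1,000 items with 8-dimensional vectors.
random.seed(0)
DIM = 8
item_vecs = {f"item_{i}": [random.gauss(0, 1) for _ in range(DIM)] for i in range(1000)}
user_vec = [random.gauss(0, 1) for _ in range(DIM)]

top = top_k_by_inner_product(user_vec, item_vecs, k=5)
```

The scan touches every item on every request, so its cost grows linearly with the catalog. That is acceptable for a toy catalog but not for a 10-20 millisecond budget over very large collections, which is the gap a dedicated similarity search component fills.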

2. How to make the online service of the recommendation system highly scalable?

I keep stressing that we want horizontal scalability: when traffic goes up, the pressure on the service should be relieved just by adding machines. Our approach is to divide the online system into three groups: online storage, the online service group, and the model service group.

We logically decouple the model service from the online service so that the whole architecture stays scalable: either the model service or the online service can be scaled out by adding machines to relieve the pressure on the servers.
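The decoupling described above can be sketched roughly as follows. This is only an illustration under assumptions: the class and function names are hypothetical, the "model service" is a toy scoring function, and round-robin dispatch is just one simple way to spread requests over replicas.

```python
from typing import Callable, Dict, List

# Hypothetical interface: a model service replica takes (user_id, candidates)
# and returns a score per candidate.
ModelReplica = Callable[[str, List[str]], Dict[str, float]]


class OnlineService:
    """Stateless request handler sitting between online storage and model replicas."""

    def __init__(self, storage: Dict[str, List[str]], replicas: List[ModelReplica]):
        self.storage = storage          # stands in for the online storage group
        self.replicas = list(replicas)  # stands in for the model service group
        self._calls = 0

    def add_replica(self, replica: ModelReplica) -> None:
        # Scaling out the model service is just registering one more replica.
        self.replicas.append(replica)

    def recommend(self, user_id: str, k: int) -> List[str]:
        candidates = self.storage.get(user_id, [])
        # Round-robin dispatch: no replica holds per-user state.
        replica = self.replicas[self._calls % len(self.replicas)]
        self._calls += 1
        scores = replica(user_id, candidates)
        return sorted(candidates, key=lambda c: scores[c], reverse=True)[:k]


def make_toy_replica(name: str, call_log: List[str]) -> ModelReplica:
    """Toy model: scores a candidate by the length of its id and logs which replica ran."""
    def replica(user_id: str, candidates: List[str]) -> Dict[str, float]:
        call_log.append(name)
        return {c: float(len(c)) for c in candidates}
    return replica
```

Because the online service only depends on the replica interface and keeps no per-user state, either side can be scaled by adding instances without touching the other.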

3. How to support different timeliness in different scenarios?

As an engineer working on feed products, I often get requirements like the following from product managers. On the integrated channel, recommend articles from the last 3 days and videos from the last 7 days. The history channel, given its data volume, is less demanding on timeliness, so recommend articles from the last 30 days and videos from the last 60 days. For related articles, use articles from the last 7 days, and for related videos, use videos from the last 30 days.

To be fair, these requirements are perfectly reasonable, because they come from the product itself and from what users expect of it. But they create serious problems for the recommendation system.

Let's do a quick count of the scenarios.

The product manager needs us to support two content types, article recommendation and video recommendation, and each is split across channels, where the integrated channel and the smaller channels differ in content and scope. There are at least ten or so small channels, each combined with the two content types: about 2 × 10 = 20 sets of data. Add the related-content recommendations and we may end up with 40 sets of data.

To support different timeliness, I would need to maintain 40 different sets of data, and in a recommendation system that means considerable maintenance cost and risk of error. 40 sets of data may mean 40 pieces of logic and 40 data streams, which is a nightmare for whoever takes over.

So in the overall architecture we design a complete set of tools and processes for handling different scenarios and different timeliness requirements. This keeps online management simple, less error-prone, and still flexible, and new timeliness requirements can be added easily.
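One way to picture such a tool, purely as a sketch and not the actual design described in the talk, is a single rule table plus one shared filter instead of one data set per scenario. The scenario names, the item fields, and the function name below are hypothetical; the day counts are the ones from the product manager's examples above.

```python
from datetime import datetime, timedelta

# (scenario, content type) -> maximum content age in days.
# Adding a new timeliness requirement means adding one row here.
FRESHNESS_DAYS = {
    ("integrated_channel", "article"): 3,
    ("integrated_channel", "video"): 7,
    ("history_channel", "article"): 30,
    ("history_channel", "video"): 60,
    ("related", "article"): 7,
    ("related", "video"): 30,
}


def filter_fresh(items, scenario, now):
    """Keep items whose age is within the window configured for this scenario.

    Each item is a dict with hypothetical keys "id", "type" and "published_at".
    """
    kept = []
    for item in items:
        max_age = FRESHNESS_DAYS.get((scenario, item["type"]))
        if max_age is None:
            continue  # no rule configured: drop the item rather than guess
        if now - item["published_at"] <= timedelta(days=max_age):
            kept.append(item)
    return kept
```

With this shape, all scenarios read from the same candidate pool and differ only in configuration, so there is one piece of logic and one data stream to maintain.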

Evaluation of the effect of recommendation system

Evaluating a recommendation system involves some common indicators: click-through rate, click-user ratio (the share of exposed users who click), clicks per user, retention rate, conversion rate, and so on.

1. Click-user ratio

This is the number of users who clicked divided by the number of users who were exposed to recommendations, and it is an important indicator of the reach of the recommendation system. When evaluating a model, the click-through rate may rise while the click-user ratio stays flat. That shows the recommendations only work better for some existing users; the users we were not reaching are still not being drawn into using the recommendation system. The click-user ratio and the click-through rate therefore evaluate different aspects of the recommendation system.

2. Clicks per user

This is the average number of clicks per person per day in the recommendation system. It is an indicator that needs continuous attention, because it truly reflects how deeply users engage with the product.

3. Retention rate and conversion rate

Retention rate and conversion rate may not be such direct indicators for a recommendation system. For example, how much recommendation affects retention depends largely on the product form. They are still indicators we use to evaluate the recommendation system: at the very least, we need to know how much an iteration of the recommendation system affects the retention rate. If the retention rate drops after an iteration, that iteration may not be allowed to go live even if the click-through rate and the click-user ratio are rising, because it hurts the retention metric.
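The click-based indicators above can be computed from one day of event logs roughly as follows. This is a sketch under assumptions: the event format and function name are hypothetical, and clicks per user is taken over exposed users, which the text does not specify.

```python
def click_metrics(events):
    """Compute click indicators from one day of (user_id, event_type) pairs.

    event_type is either "impression" or "click".
    """
    impressions = clicks = 0
    exposed_users, clicking_users = set(), set()
    for user_id, event_type in events:
        if event_type == "impression":
            impressions += 1
            exposed_users.add(user_id)
        elif event_type == "click":
            clicks += 1
            clicking_users.add(user_id)
    n_exposed = len(exposed_users)
    return {
        # clicks divided by impressions
        "click_through_rate": clicks / impressions if impressions else 0.0,
        # users who clicked divided by users who saw recommendations
        "click_user_ratio": len(clicking_users) / n_exposed if n_exposed else 0.0,
        # assumption: averaged over exposed users
        "clicks_per_user": clicks / n_exposed if n_exposed else 0.0,
    }


# Made-up logs: u2 never clicks; after the iteration only u1 clicks more.
before = [("u1", "impression")] * 4 + [("u1", "click")] + [("u2", "impression")] * 4
after = [("u1", "impression")] * 4 + [("u1", "click")] * 3 + [("u2", "impression")] * 4

m_before = click_metrics(before)  # click_through_rate 0.125, click_user_ratio 0.5
m_after = click_metrics(after)    # click_through_rate 0.375, click_user_ratio 0.5
```

In this toy data the click-through rate triples while the click-user ratio stays at 0.5, which is the pattern described above: the iteration helped a user who was already clicking and reached nobody new.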

Some other aspects were already covered in a previous article, "Shenze data VP Zhang Tao: personalized recommendation from entry to proficiency (with recommended product manager training secret book)".

Timeliness. If we are building a recommendation system for a news product, the content recommended to users should be current, not something that happened last week.

Diversity. Diversity is an indicator that is easy to overlook, because click-through data will look better if you don't pursue diversity.

You may have had this experience: you show interest in sports content, and gradually all your recommendations become sports-related. It gets hard to see anything else, and the recommended content grows narrower and narrower. In the short term, increasing diversity may cost some click-through rate, but in the long run diversity improves the user experience of the whole product, so the long-term and short-term trade-off needs to be weighed.

Stability. If the server frequently goes down, or responses always take five seconds, the service is basically unusable, so we must also evaluate the recommendation system from a service point of view.

Coverage. Coverage means being able to recommend enough long-tail content. A UGC platform needs to encourage users to produce content. Even for small users with no followers, we want their content to get some exposure and some likes, which over time forms a virtuous circle.

If the platform always distributes content from big V accounts, the experience of new users on the platform becomes very poor, the room for smaller content gradually disappears, and the platform ends up occupied by big V accounts. So coverage is also an indicator the recommendation system needs to consider.
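The text does not give a formula for coverage. One common way to quantify it, used here as an assumption, is the share of the catalog that appears in at least one user's recommendations. The function name and inputs are hypothetical.

```python
def catalog_coverage(recommendation_lists, catalog):
    """Fraction of catalog items that appear in at least one recommendation list."""
    catalog = set(catalog)
    if not catalog:
        return 0.0
    shown = set()
    for items in recommendation_lists:
        shown.update(items)
    # Ignore anything recommended that is not part of the catalog.
    return len(shown & catalog) / len(catalog)
```

The same function can be applied to creator ids instead of item ids to check whether exposure is concentrated on a few big accounts.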

As for which indicators to consider and how to define them, I think that should depend on the product form and the stage the product is in.

So, faced with these indicators, do we have analysis tools powerful enough to support this work? For example, when I want to compare the conversion rate of the recommendation system with that of another banner, can our analysis tool do that?

In my daily work, I do conversion funnel analysis and retention analysis in Shenze Analysis. Retention analysis is a relatively complex method with many dimensions: it may need to examine users' retention behavior across various time periods and conditions.

If you want to analyze the impact of recommendations on retention, you can run the retention analysis directly in Shenze Analysis.
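For readers without such a tool, the simplest form of retention analysis, next-day retention for one cohort of new users, can be sketched as below. This only illustrates the calculation and is not how the analysis product works; the data structures and function name are assumptions.

```python
from datetime import date, timedelta


def next_day_retention(first_seen, active_days, cohort_day):
    """Share of users first seen on cohort_day who were active again the next day.

    first_seen maps user id -> date of first activity.
    active_days maps user id -> set of dates on which the user was active.
    """
    cohort = [user for user, day in first_seen.items() if day == cohort_day]
    if not cohort:
        return 0.0
    next_day = cohort_day + timedelta(days=1)
    retained = sum(1 for user in cohort if next_day in active_days.get(user, set()))
    return retained / len(cohort)
```

Computing this separately for users who did and did not see the new recommendation model is one way to compare the retention effect of an iteration.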

In addition, I would like to share with you some ideas about iteration.

As an example, we analyze how differently the recommendation system performed for the new users of December 18.

We want to know how this iteration of the recommendation system performs for both new and existing users.

As can be seen from the picture, there was a significant improvement for new users the next day, but not for existing users. The model launch is clearly good for new users, and we need to analyze further why the effect improves significantly for new users but not for existing ones.

It may be that the data sampling method favors new users, or that feature feedback for new users is more timely, or that some long-term features of existing users are processed inappropriately, and so on.

Therefore, to build a useful recommendation system, we may face these challenges:

First, the quality of data collection and processing, which I mentioned earlier: how to collect the data and how to do feature engineering.

Second, combining the algorithm with the business: how to understand the business scenario deeply and how to choose a suitable algorithm.

Third, building the recommendation system and its evaluation system, and solving the challenges of online services.

Fourth, cost control: building the algorithms, data, online services, and evaluation methods from scratch costs a great deal of manpower and time.

Finally, I would like to answer a question that you asked me at the beginning of the talk: how to build a closed loop and systematize it. This is in fact the core advantage of the Shenze intelligent recommendation system: a full-process, real-time, fast-iterating recommendation closed loop.

As you can see from my talk, building a real recommendation system runs into all kinds of problems. In our experience, data quality deserves a great deal of attention. It covers data collection across all client platforms, data processing and modeling, and building the label system and user profiles.

Once we have the data, we build the algorithm. We have rich experience in algorithm modeling, and because the data is built on Shenze Analysis, we get real-time data feedback and fast modeling.

After the algorithm goes live, we run a multi-dimensional verification analysis of the results: on the one hand to understand the effect of this round of recommendations, and on the other to understand how to improve it in the future. In addition, there are two more important aspects of the solution we provide.

First, Shenze data supports private deployment, so the Shenze intelligent recommendation system does too, and the whole system is deployed on the customer's own servers.

Second, it is open. Customers can call all kinds of intermediate data and interfaces themselves: for example, the behavior data we help customers collect, the user portraits and model results generated while building the recommendation system, some content analysis results, and the modeling methods produced at each stage are all available for customers to call. The Shenze data solution is an open white box: from experimental design, to data collection, to intermediate feature engineering, to model construction, to the final recommendation results, the data and interfaces are all available for customers to access and inspect.

Finally, I would like to emphasize two points:

First, a recommendation system is not just an algorithm; it is a systems engineering project, and the algorithm is only one of its four parts. When we implement a recommendation system, building the algorithm usually takes only 20% to 30% of the time.

Second, data comes first: data is the premise of every algorithm. In our experience, when a recommendation system falls short of expectations, most of the time the cause is not the model or the service but data that was not done right.

The above are some thoughts on the recommendation system that I have summed up from many years of work experience and practice, hoping to inspire your work.

