2025-03-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
Every enterprise has a lot of data, but whether that data can be converted into commercial value is a major concern. Alibaba once joked that it was a company eating steamed bread while sitting on a gold mine of data. A few years ago the group had accumulated a lot of data, but the data was not really put to use, for several reasons: the big data technology stack was not yet mature, and the operations team had little awareness of how to apply data. Today, the scope of data application at Alibaba is becoming broader and broader.
This article is based on a speech given by Yuan Chuo, a senior technical expert at Alibaba, at the mobile R&D platform EMAS session of the 2018 Yunqi Conference, Hangzhou Station. It introduces how to build an intelligent operation system for the mobile Internet era, in three parts: first, the mission of intelligent operation and its typical application scenarios; second, the architecture of the personalized recommendation system; third, the application of A/B testing in the intelligent operation system.
I. The mission and typical application scenarios of intelligent operation
The standard for measuring whether an intelligent operation system is good is very clear: whether it helps the enterprise achieve growth in its data, because growth is the enterprise's core demand.
To achieve intelligent operation, an enterprise must first build a closed loop of data operation. Traditional BI collects data, produces reports for the boss, and lets the boss make decisions. The most important thing for an intelligent operation system, however, is to apply the data to actual business scenarios so that the data forms a closed loop: collect data, turn it into predictive ability through model training, apply that ability to actual business scenarios, and finally feed the resulting user data back into the system. After several rounds of iteration, the prediction ability of the whole system becomes stronger and stronger.
Enterprises want to improve business results, and that improvement depends on how much users recognize the platform. The business statistics module of EMAS takes on the work of data collection and of understanding user behavior. The role of machine intelligence is to turn user behavior data into operational actions for the enterprise.
The process can be divided into several steps. First, starting from the raw data and taking a new user as an example, the system tags the user for the first time based on the user's clicks on popular content during the cold-start phase, which roughly identifies what type the user belongs to. Second, the system makes tentative pushes, such as articles or products, and the user clicks on some of what is pushed. After several interactions, the machine has a deeper understanding of the user. Finally, after many interactions between the user and the platform, the enterprise applies matching operational strategies, such as sales promotions, and the conversion effect improves significantly. This is the basic process of the intelligent operation system.
Our view of the user life cycle runs from new customer to regular customer to loyal customer who helps spread the word, and this whole span is fairly long. If you directly push a message to a new user asking him to place an order, the effect is often poor. It is therefore necessary to analyze each stage of the user's life cycle in detail.
Three typical application scenarios for intelligent operations:
First, a thousand faces for a thousand people. Taobao also worked on recommendation in the PC era, but the effect was not good. In the wireless era, the effect of personalized recommendation has improved markedly, because user behavior has changed greatly: it is aimless, fragmented, and happens anytime, anywhere. To make full use of the fragmented time users give us and get consumers suddenly interested in our products, enterprises need very deep understanding of and insight into their users.
Second, precision marketing. Before a marketing campaign, analyze the audience, the specific pricing strategy, and the sales forecast under that pricing strategy, so that the enterprise knows in advance how far it will get toward its KPIs.
Third, intelligent product selection. The scenarios above are mostly about how products can interact more with existing users. Intelligent product selection applies when we understand a target audience and want to reach users we did not reach before. If a supermarket wants to attract young people, it needs to adjust its product mix to bring young users back. Hema and Taobao Xinxuan are cases that Alibaba has done well.
II. Architecture of personalized recommendation system
Next, I would like to introduce the personalized recommendation system. Alibaba Group has accumulated a lot of experience in personalized recommendation in recent years. Take the Mobile Taobao home page as an example: many places have been personalized, such as the entry images. Every app has sub-channels, and most sub-channel entry images are static images made by a designer. If you use the sub-channel's data to make a personalized match with the user, so that each user sees a different entry image, the click-through conversion of the entry improves greatly.
What a good personalized recommendation system needs to pay attention to:
First, engineering implementation. The traditional way to implement personalized recommendation is to compute a recommendation list for each user at a certain point in time and refresh it every day. What is the problem with this? The amount of user data keeps growing, storage costs grow with it, and the cost to the enterprise becomes very large. Therefore, when designing the system, we need to consider using tags. In addition, the order of the goods under a tag should differ from person to person, so we add a second sorting step to ensure that, although the goods in everyone's recommendation list are the same, the order is different.
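The tag-plus-second-sort idea can be sketched roughly as follows. This is a minimal illustration, not the talk's actual implementation: the tag index, the item names, the affinity scores, and the function names are all hypothetical.

```python
# Hypothetical shared index: tag -> items carrying that tag.
# It is stored once for all users instead of one full list per user.
TAG_INDEX = {
    "outdoor": ["tent", "boots", "jacket"],
    "digital": ["phone", "earbuds"],
}


def recall_by_tags(user_tags, tag_index):
    """First stage: collect candidate items for the user's tags, skipping duplicates."""
    seen, candidates = set(), []
    for tag in user_tags:
        for item in tag_index.get(tag, []):
            if item not in seen:
                seen.add(item)
                candidates.append(item)
    return candidates


def rerank(candidates, user_affinity):
    """Second stage: order the shared candidates by a per-user affinity score."""
    # sorted() is stable, so items with equal scores keep their recall order.
    return sorted(candidates, key=lambda item: user_affinity.get(item, 0.0), reverse=True)


def recommend(user_tags, user_affinity, tag_index=TAG_INDEX):
    """Same goods for users who share a tag, but a different order per user."""
    return rerank(recall_by_tags(user_tags, tag_index), user_affinity)
```

Under these assumptions, only the user's tags and a small affinity map are stored per user, while the item lists are shared, which is the storage saving the paragraph above describes.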
Second, real-time recommendation. Offline recommendation is mainly based on historical data, while real-time recommendation is based on the current day's data. Recommendations made to users on the same day often have the highest conversion rate. What are the challenges? First, we must have real-time computing capability, because users give us very little time; if you are five minutes late, you have basically lost the user. Second, from an algorithm point of view, we need to strike a balance between historical data and the current day's real-time data, depending on which gives the highest conversion rate.
Third, time and space. Take e-commerce: down jackets and other clothes have seasonal attributes (down jackets are for winter), and electronic products come in new and old models. If you judge that a user only ever buys new models, you should recommend new models to him. In addition, pushes have a time decay effect, so you cannot push the same goods all the time. Time and space are two dimensions that must be considered.
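One simple way to express the balance between historical and same-day signals is a weighted blend of two score tables. The sketch below assumes such a linear blend; the weight `realtime_weight`, the score values, and the function names are illustrative assumptions, and in practice the weight would be chosen by measuring conversion.

```python
def blend_scores(historical, realtime, realtime_weight=0.5):
    """Linearly blend per-item scores from historical and same-day behavior."""
    if not 0.0 <= realtime_weight <= 1.0:
        raise ValueError("realtime_weight must be between 0 and 1")
    items = set(historical) | set(realtime)
    return {
        item: (1.0 - realtime_weight) * historical.get(item, 0.0)
        + realtime_weight * realtime.get(item, 0.0)
        for item in items
    }


def top_n(scores, n):
    """Return the n highest-scoring items; ties are broken by item name."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]
```

With `realtime_weight=0` the ranking is purely offline, and with `realtime_weight=1` it follows only the current day's behavior; intermediate values are candidates to compare in an online experiment.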
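As a rough sketch of the time dimension, the example below combines a seasonal filter with an exponential decay on the age of a behavior signal. The talk does not specify a decay formula, so the half-life form, the seven-day default, and the function names are assumptions made for illustration.

```python
def decayed_weight(age_days, half_life_days=7.0):
    """Weight of a behavior signal that halves every half_life_days."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return 0.5 ** (age_days / half_life_days)


def in_season(item_months, month):
    """An item with no listed months is treated as non-seasonal."""
    return not item_months or month in item_months


def time_aware_score(base_score, age_days, item_months, month, half_life_days=7.0):
    """Zero out off-season items; otherwise discount the score by signal age."""
    if not in_season(item_months, month):
        return 0.0
    return base_score * decayed_weight(age_days, half_life_days)
```

For example, a down jacket listed for months 11, 12, 1, and 2 scores zero in July under this sketch, and an interest signal that is two half-lives old counts for a quarter of its original weight.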
Fourth, discovery. When we make personalized recommendations, the model is basically optimized toward one specific goal. What is the problem with that? A serious Matthew effect. First, recommendations depend on historical data. Why does the system push clothes to you? Because you always look at clothes, so the model judges that pushing clothes must convert best. It recommends clothes, you click again, and that produces another piece of historical data. The effect looks really good, so what will the model push next time? Clothes again. But in fact everyone has a wide range of interests. The categories pushed to you get narrower and narrower, and eventually your observed behavior gets narrower and narrower too, which does not match what people are really like. We need to widen the range of categories in the recommendation system. Second, which kind of product has the highest conversion rate? The best-sellers. Whether in finance or any other industry, best-sellers convert best, so the model determines that they convert better than ordinary products. The result is that the range of products recommended by the system also gets narrower and narrower, which is a very serious problem. So throughout the modeling process we should try to recommend things that may not appear in the user's history and make some tentative discoveries. This is very meaningful; otherwise the system will do well on short-term returns but hurt long-term returns.
So the conversion rate is very important, but discovery is even more important. Expanding categories makes your business volume bigger and bigger. The same holds for products: there must be new products after the current best-sellers, and the new products also need to become best-sellers.
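One common way to add this kind of tentative discovery is an epsilon-greedy mix: most slots go to the model's top-ranked items, and a small share goes to items outside the user's history. The talk does not name a specific technique, so epsilon-greedy, the 20% default, and all names below are assumptions.

```python
import random


def recommend_with_exploration(ranked_items, explore_pool, n, epsilon=0.2, rng=None):
    """Fill n slots; each slot explores with probability epsilon, else exploits.

    ranked_items: the model's ranking, best first (assumed to contain no duplicates).
    explore_pool: items from categories or products the user has no history with.
    """
    rng = rng or random.Random()
    # Never "explore" something the model would already show.
    pool = [item for item in explore_pool if item not in ranked_items]
    exploit = iter(ranked_items)
    result = []
    for _ in range(n):
        if pool and rng.random() < epsilon:
            choice = pool.pop(rng.randrange(len(pool)))
        else:
            choice = next(exploit, None)
            if choice is None:
                # Ranking exhausted: fall back to the exploration pool if possible.
                if not pool:
                    break
                choice = pool.pop(rng.randrange(len(pool)))
        result.append(choice)
    return result
```

Setting `epsilon` to 0 reproduces the pure conversion-optimized list, so the cost of exploration on short-term conversion can be compared directly against its effect on category breadth.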
Fifth, dirty data. Dirty data generally falls into two categories. The first is invalid data, such as data from "Singles Day", because user behavior on that day is very unusual: people buy things on "Singles Day" that they would not normally buy. This kind of data is not very helpful for daily recommendations and must be removed. The second is cheating data. The volume of data from activities such as faking credit ratings or farming points is often very large, and if it is not removed, the final predictions will deviate greatly from the true values.
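The two kinds of cleaning described above can be sketched as a single filter over behavior events. The event layout, the per-user daily threshold used as a stand-in for cheating detection, and the function name are hypothetical; real cheating detection is usually far more involved.

```python
from collections import Counter
from datetime import date


def clean_events(events, excluded_dates, max_events_per_user_day=200):
    """Drop events from excluded dates and from implausibly active user-days.

    events: list of dicts with keys "user_id", "day" (a datetime.date), "item".
    excluded_dates: set of dates whose behavior is not representative.
    max_events_per_user_day: assumed threshold above which a user-day is
        treated as suspected cheating.
    """
    per_user_day = Counter((e["user_id"], e["day"]) for e in events)
    return [
        e
        for e in events
        if e["day"] not in excluded_dates
        and per_user_day[(e["user_id"], e["day"])] <= max_events_per_user_day
    ]
```

For instance, passing `{date(2018, 11, 11)}` as `excluded_dates` removes that year's Singles Day behavior before training.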
Finally, I would like to introduce the architecture of Alibaba's real-time recommendation system. It is roughly divided into several parts: the EMAS data statistics module collects the data; the data is then processed and used for training to form a model; and the results are applied to the production environment. In the production environment the data is generally stored in a graph database, because it has a network structure, and a very simple API on top makes the data easy to call. One very important part of the system is that model training must accept input based on industry experience, because in practice we have found that a general model with some industry rules layered on top works very well. Every industry has its own particularities, and it is unrealistic today to apply one general set of algorithms to all industries. This is the simple system architecture of our personalized recommendation system. It must be a closed loop, and the data must flow back, because otherwise we do not know whether the recommendations are accurate or whether our insight into the user is accurate. We must make sure that after the system has run for a period of time, the metrics rise as a whole.
III. The application of A/B testing in intelligent operation
Finally, let's talk about the application of A/B testing in intelligent operation. As we all know, algorithms are developing very fast: deep learning was very popular a few years ago, reinforcement learning more recently, and other new algorithms are developing rapidly, so we need to adopt new algorithms as we iterate on the model. But in general we cannot be sure in advance which algorithm works better. Even after a lot of offline testing, we still have to run experiments in the production environment. We can do bucket testing with a benchmark bucket and a test bucket: the test bucket uses one model, the benchmark bucket uses another, and we compare the effects of the two models. In practice, before running the A/B test we must run an A/A test to make sure that the data of the two buckets is consistent before the experiment. Only then, when you change the model of one bucket, is the resulting data reliable.
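The bucket split and the A/A check might look roughly like the sketch below. It assumes users are assigned to buckets by hashing their ID, and that the A/A check is a two-proportion z-test on conversion rate; neither detail comes from the talk, and the bucket names, the 1.96 threshold, and the function names are illustrative.

```python
import hashlib
import math


def assign_bucket(user_id, experiment, buckets=("benchmark", "test")):
    """Deterministically map a user to a bucket by hashing experiment and user ID."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return buckets[int(digest, 16) % len(buckets)]


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return 0.0 if se == 0 else (p_a - p_b) / se


def aa_test_passes(conv_a, n_a, conv_b, n_b, z_threshold=1.96):
    """A/A check: with the same model in both buckets, rates should not differ."""
    return abs(two_proportion_z(conv_a, n_a, conv_b, n_b)) < z_threshold
```

If the A/A check fails while both buckets run the same model, the split itself is biased, and any later difference between models could not be attributed to the model change.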