Shulou (Shulou.com) 06/01 report, updated 2025-01-30
How do you detect mutation points (abrupt changes) in Baidu Index data? Many newcomers are unsure how to approach this, so this article walks through the process in detail.
Baidu Index is a very useful tool: it shows how the popularity of a keyword has trended over the past days, and that data can then be analyzed. Used well, it can be very valuable. The complete source code of this project is available by following the Python utility code official account at the bottom of the article and replying with "sudden changes in the Baidu index".
The goal today is to find the positions of abrupt values in the Baidu Index, as shown in the picture.
In a 30-day data stream it is easy to spot abrupt values by eye, but what about 180 days? Finding them manually is no longer easy:
How can Python find the mutation points in these 180 days automatically? Because this is a problem of detecting mutation points (change points) in a time series, we can use a change-point detection algorithm called the Pettitt test.
1. Get data
Locating the data interface with the browser's developer tools shows that the data it returns is encrypted:
The encryption looks a lot like string substitution. To decrypt it from scratch, you would need to compare the source data with the encrypted data, or read the front-end source code directly. Since that is not today's topic, I used someone else's open-source project, a Baidu Index crawler, and made some changes:
https://github.com/longxiaofei/spider-BaiduIndex/tree/master/new_spider_without_selenium
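The string-substitution idea mentioned above can be sketched as follows. This is only an illustration under an assumption: that the scheme is a character-for-character substitution in which the first half of a key string maps onto the second half. The function name `decrypt` and the sample key are hypothetical, and the real interface may work differently.

```python
def decrypt(key, data):
    """Decode `data` with a substitution table built from `key`.

    Assumption (not confirmed by the article): the first half of `key`
    lists the cipher characters and the second half lists the plain
    characters they stand for, position by position.
    """
    half = len(key) // 2
    mapping = {key[i]: key[half + i] for i in range(half)}
    return "".join(mapping[ch] for ch in data)


# Hypothetical key: x -> 0, k -> ",", q -> 1, z -> 2
plain = decrypt("xkqz0,12", "qzkxq")
values = [int(v) for v in plain.split(",")]
```

The decoded string is a comma-separated list of daily values, which is why the sketch finishes by splitting it into integers.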
All the source code for this article is available by following the official account at the bottom of the article (Python practical treasure book) and replying with "Baidu index mutation point".
Call the API to crawl the data, then store the crawled data in arrays keyed by keyword. You can easily modify the code to add or remove keywords. To keep the problem simple, only one keyword, blockchain, is analyzed here. The code is as follows:
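A minimal sketch of the "store by keyword" step, assuming each crawled record is a dict with `keyword`, `date`, and `index` fields. That record shape and the function name `group_by_keyword` are hypothetical; adapt them to whatever the crawler actually returns.

```python
from collections import defaultdict


def group_by_keyword(records):
    """Collect index values per keyword, ordered by date.

    Assumption: every record looks like
    {"keyword": str, "date": "YYYY-MM-DD", "index": str or int}.
    """
    series = defaultdict(list)
    # ISO dates sort correctly as plain strings.
    for rec in sorted(records, key=lambda r: r["date"]):
        series[rec["keyword"]].append(int(rec["index"]))
    return dict(series)


# Hypothetical sample records, not real Baidu Index data.
sample = [
    {"keyword": "blockchain", "date": "2020-01-02", "index": "120"},
    {"keyword": "blockchain", "date": "2020-01-01", "index": "100"},
]
data = group_by_keyword(sample)
```

Adding or removing keywords then only changes which keys appear in the resulting dict.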
The results are as follows:
2. Mutation point algorithm
The Pettitt mutation point detection algorithm used here was originally written in R, and the implementation is actually very simple. Its author does not explain the reasoning but gives the corresponding mathematical formula, so we follow the author's train of thought to see how it works.
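For reference, the Pettitt test is usually stated as follows for a series x_1, ..., x_n. The symbols U, K, and p are the conventional ones and may not match the notation of the R source the article refers to.

```latex
U_{t,n} = \sum_{i=1}^{t} \sum_{j=t+1}^{n} \operatorname{sgn}(x_i - x_j),
\qquad t = 1, \dots, n-1

K_n = \max_{1 \le t < n} \left| U_{t,n} \right|

p \approx 2 \exp\!\left( \frac{-6 K_n^{2}}{n^{3} + n^{2}} \right)
```

The most likely mutation point is the t at which |U_{t,n}| reaches its maximum K_n, and a small p suggests that the change is significant.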
The algorithm code is as follows:
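A minimal pure-Python sketch of the Pettitt test, written from the standard definition of the statistic rather than taken from the article's own listing. The function name `pettitt` and its return shape are assumptions.

```python
import math


def pettitt(series):
    """Return (t, K, p) for the single most likely mutation point.

    t is the 0-based index of the last observation before the change,
    K is max |U_t|, and p is the usual approximate significance level.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    u = 0
    best_t, best_k = 0, -1
    # Recurrence: U_t = U_{t-1} + sum_j sgn(x_t - x_j), for t = 1..n-1.
    for t in range(n - 1):
        x_t = series[t]
        u += sum((x_t > x_j) - (x_t < x_j) for x_j in series)
        if abs(u) > best_k:
            best_k, best_t = abs(u), t
    p = min(1.0, 2.0 * math.exp(-6.0 * best_k ** 2 / (n ** 3 + n ** 2)))
    return best_t, best_k, p


# Synthetic series with one obvious level shift after index 9.
t, k, p = pettitt([1] * 10 + [10] * 10)
```

The nested sum makes this O(n^2), which is fine for a few hundred daily values.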
Next, we pass the data into the function to get its mutation point. The function finds only one mutation point per piece of data, and we need several, so we also set up a moving window and get the mutation position within each window.
3. Set the window to get the mutation position of each window
Use a 30-day window and detect the mutation point in each window:
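The windowing step can be sketched like this. It assumes non-overlapping 30-day windows and keeps a point only when the approximate p-value is at most `alpha`. The `step` and `alpha` parameters and the function names are assumptions, not settings taken from the article.

```python
import math


def pettitt(series):
    """Compact Pettitt test: returns (t, K, p), t being a 0-based index."""
    n = len(series)
    u, best_t, best_k = 0, 0, -1
    for t in range(n - 1):
        u += sum((series[t] > x) - (series[t] < x) for x in series)
        if abs(u) > best_k:
            best_k, best_t = abs(u), t
    p = min(1.0, 2.0 * math.exp(-6.0 * best_k ** 2 / (n ** 3 + n ** 2)))
    return best_t, best_k, p


def window_mutations(values, window=30, step=None, alpha=0.05):
    """Run the test on each window and return positions in the full series.

    Assumptions: step defaults to the window size (no overlap), and
    alpha=0.05 is an illustrative threshold only.
    """
    step = step or window
    points = []
    for start in range(0, len(values) - window + 1, step):
        t, _, p = pettitt(values[start:start + window])
        if p <= alpha:
            points.append(start + t)
    return points


# Synthetic 90-day series with one level shift inside the second window.
demo = [5] * 45 + [50] * 45
found = window_mutations(demo)
```

A smaller `step` makes the windows overlap, which can catch changes that fall near a window boundary at the cost of duplicate detections.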
The results are as follows:
The effect is hard to judge from raw output like this, so let's visualize it with matplotlib:
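A minimal plotting sketch, assuming `values` is the daily index series and `points` is a list of detected positions. Both names, the helper `mutation_markers`, and the output file name are hypothetical.

```python
def mutation_markers(values, points):
    """Pair each detected index with its value, skipping invalid indices."""
    return [(i, values[i]) for i in points if 0 <= i < len(values)]


def plot_mutations(values, points, path="mutations.png"):
    """Draw the series as a line and mark detected points in red."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(12, 4))
    ax.plot(range(len(values)), values, label="Baidu index")
    markers = mutation_markers(values, points)
    if markers:
        xs, ys = zip(*markers)
        ax.scatter(xs, ys, color="red", zorder=3,
                   label="detected mutation point")
    ax.set_xlabel("day")
    ax.set_ylabel("index value")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `plot_mutations(values, points)` writes the figure to `mutations.png` in the working directory.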
Results:
To be honest, I am not fully satisfied with this result: two mutation points were missed, and the one on the right is actually the more important of the two. Apart from these two, the detection works reasonably well overall.