
How to detect the sudden change points of Baidu Index

2025-01-30 Update From: SLTechnology News&Howtos (shulou)

Shulou(Shulou.com)06/01 Report--

Many newcomers are unsure how to detect change points in Baidu Index data. This article walks through the process in detail, and hopefully it is useful to anyone who needs it.

Baidu Index is a very useful tool: it shows how the popularity of a keyword has trended over recent days and lets us analyze that data. Used well, it can be very valuable. You can get the complete source code of this project by following the Python utility code official account at the bottom of the article and replying "sudden changes in the Baidu index".

Today's goal is to show how to locate the change points (abrupt jumps) in Baidu Index data, as shown in the picture.

In a 30-day series it is easy to spot abrupt changes by eye, but what about 180 days? Finding them manually is much harder:

How can we use Python to find the change points in these 180 days automatically? Since this is change-point detection on a time series, we can use a change-point detection algorithm called the Pettitt test.

1. Get data

Using the browser's developer tools, we can locate the data interface, and we find that the data it returns is encrypted:

The encryption looks a lot like string substitution. To decrypt it from scratch, you would need to compare the source data with the encrypted data, or read the front-end source code directly. Since that is not today's topic, I used someone else's open source project with a few changes, the Baidu Index Crawler:

https://github.com/longxiaofei/spider-BaiduIndex/tree/master/new_spider_without_selenium
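To make the "string substitution" idea concrete, here is a minimal sketch of how such a decoding step could look. It assumes the key's first half lists the cipher characters and its second half lists the plain characters they stand for. That layout, the function name `decrypt`, and the toy key are assumptions for illustration, not something confirmed by this article.

```python
def decrypt(key, data):
    """Decode `data` with a character-substitution table built from `key`.

    Assumption (hypothetical layout): the first half of `key` holds the
    cipher characters and the second half holds the matching plain characters.
    """
    half = len(key) // 2
    mapping = dict(zip(key[:half], key[half:]))
    return "".join(mapping[ch] for ch in data)


# Toy key, made up for this example: q->0, w->1, e->2, ..., p->9, a->","
toy_key = "qwertyuiopa" + "0123456789,"
plain = decrypt(toy_key, "wearty")
values = [int(v) for v in plain.split(",")]
print(plain, values)
```

The decoded string is a comma-separated list of daily values, so a `split(",")` is enough to turn it into numbers.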


Call the API to crawl the data, then store the crawled values in arrays keyed by keyword. You can easily modify my code to add or remove keywords. To keep things simple, I analyze only one keyword here, "blockchain". The code is as follows:
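A minimal sketch of the storage step, assuming each crawled record is a dict with `keyword`, `date`, and `index` fields. Those field names, the helper `group_by_keyword`, and the sample records are hypothetical placeholders, not real Baidu Index data.

```python
from collections import defaultdict


def group_by_keyword(records):
    """Collect index values into one date-ordered list per keyword."""
    series = defaultdict(list)
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings
    for rec in sorted(records, key=lambda r: r["date"]):
        series[rec["keyword"]].append(int(rec["index"]))
    return dict(series)


# Placeholder records, for illustration only
sample = [
    {"keyword": "blockchain", "date": "2019-01-02", "index": "120"},
    {"keyword": "blockchain", "date": "2019-01-01", "index": "100"},
    {"keyword": "blockchain", "date": "2019-01-03", "index": "90"},
]
data = group_by_keyword(sample)
print(data)
```

Adding or removing keywords then only changes which records are fed in; the grouping code stays the same.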

The results are as follows:

2. Change-point algorithm

The Pettitt change-point detection algorithm we refer to is written in R, and its implementation is actually very simple. The author does not explain why it works, but gives the corresponding mathematical formula, so we try to follow the author's train of thought to see how it works.
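For reference, the Pettitt test is usually stated as follows for a series x_1, ..., x_n. The notation here is the common textbook form and may differ from the formula the author had in mind.

```latex
% Rank-based statistic for a candidate split after position t
U_t = \sum_{i=1}^{t} \sum_{j=t+1}^{n} \operatorname{sgn}(x_i - x_j),
\qquad t = 1, \dots, n-1

% The change point is the t that maximizes |U_t|
K = \max_{1 \le t < n} \lvert U_t \rvert

% Approximate significance of the detected change point
p \approx 2 \exp\!\left( \frac{-6 K^2}{n^3 + n^2} \right)
```

Intuitively, U_t counts how often values before the split are larger or smaller than values after it. A large |U_t| means the two sides differ systematically.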

The algorithm code is as follows:
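A minimal NumPy sketch of the Pettitt test, written from the standard definition rather than ported from the R code mentioned above. The function name `pettitt` and its return convention are choices made for this example.

```python
import numpy as np


def pettitt(x):
    """Pettitt test for a single change point.

    Returns (t, K, p): t is the 0-based index of the first sample after the
    change, K is max |U_t|, and p is the approximate p-value.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # s[i, j] = sgn(x[i] - x[j]) for every pair of samples
    s = np.sign(x[:, None] - x[None, :])
    # U_t sums the signs over i in the first t samples and j in the rest
    u = np.array([s[:t, t:].sum() for t in range(1, n)])
    t = int(np.argmax(np.abs(u))) + 1
    k = float(abs(u[t - 1]))
    p = min(1.0, 2.0 * float(np.exp(-6.0 * k**2 / (n**3 + n**2))))
    return t, k, p


# Synthetic series with one obvious jump after the 10th sample
demo = [1] * 10 + [10] * 10
t, k, p = pettitt(demo)
print(t, k, p)
```

This version builds an n-by-n sign matrix, which is fine for a few hundred points. For much longer series a rank-based formulation would use less memory.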

Next, we pass the data into the function to get its change point. The algorithm finds only one change point per series, but we need several, so we set up a moving window and detect the change position inside each window.

3. Set the window and get the change position in each window

Split the data into 30-day windows and detect the change point in each window:
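A minimal sketch of the windowing step, assuming non-overlapping 30-day windows (the article does not say whether its windows overlap). The helper names `pettitt` and `windowed_change_points` are choices made for this example, and the series is synthetic.

```python
import numpy as np


def pettitt(x):
    """Return (t, p): 0-based index of the first sample after the change, and p-value."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    s = np.sign(x[:, None] - x[None, :])
    u = np.array([s[:t, t:].sum() for t in range(1, n)])
    t = int(np.argmax(np.abs(u))) + 1
    k = float(abs(u[t - 1]))
    p = min(1.0, 2.0 * float(np.exp(-6.0 * k**2 / (n**3 + n**2))))
    return t, p


def windowed_change_points(series, window=30):
    """Run the test on consecutive windows; a trailing partial window is skipped."""
    points = []
    for start in range(0, len(series) - window + 1, window):
        t, p = pettitt(series[start:start + window])
        # shift the window-local position back to an index in the full series
        points.append((start + t, p))
    return points


# Synthetic 60-day series: flat, then a jump at day 45
series = [1] * 45 + [10] * 15
points = windowed_change_points(series, window=30)
print(points)
```

Every window reports a candidate, even a flat one, so the p-value is returned alongside the position to let the caller discard weak candidates (for example, those with p above 0.05).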

The results are as follows:

The raw output is hard to judge, so let's visualize it with matplotlib:
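A minimal plotting sketch: the series as a line and each detected change point as a dashed vertical marker. The function name `plot_change_points` and the synthetic data are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to open a window
import matplotlib.pyplot as plt


def plot_change_points(values, change_points):
    """Plot the series and mark each change point with a dashed red line."""
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(range(len(values)), values, label="index")
    for cp in change_points:
        ax.axvline(cp, color="red", linestyle="--")
    ax.set_xlabel("day")
    ax.set_ylabel("index value")
    ax.legend()
    return fig, ax


# Synthetic series with a jump at day 45
values = [1] * 45 + [10] * 15
fig, ax = plot_change_points(values, [45])
# fig.savefig("change_points.png") would write the chart to disk
```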

Results:

To be honest, I am not fully satisfied with this result: two change points were missed, and the one on the right is actually the more important of the two. Apart from these two misses, the detection method works reasonably well overall.



© 2024 shulou.com SLNews company. All rights reserved.
