Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How does Python capture the evaluation of JD.com Mall

2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/01 Report--

This article mainly explains "how Python captures the evaluation of JD.com Mall". The content of the explanation in the article is simple and clear, and it is easy to learn and understand. Please follow the editor's train of thought to study and learn "Python how to capture JD.com Mall Evaluation".

Distributed capture of the evaluation information of JD.com Mall

The purpose of using distributed crawling is to quickly grab as many commodity evaluations as possible in a short time, so as to make the analysis results more accurate.

Find out the URL rule of evaluation request, and get the following URL combination link

Using the Chrome plug-in Postman to test whether the link is available, it is found that JD.com 's access to evaluation information does not verify anti-crawling measures such as Cookie.

Start coding and use scrapy to capture the commodity evaluation information of JD.com Mall and store it in the database for use.

Data analysis

Extract the corresponding data from the database and begin to analyze

Use python's extended library wordcloud to extract good, medium and bad keywords respectively, and generate corresponding word cloud pictures.

Analyze the proportion of sales of different colors of the product, and generate bar charts, such as the proportion of iphone7 in different colors gold, rose gold, silver, black, bright black, and red.

Analyze the percentage of sales in different configurations of the product, and generate bar charts, such as iphone7 32G, 64G, 128G storage

Analyze the sales and comment time of the product and generate a line chart to find out when the product is the best seller.

Analyze the channels through which users buy the product, such as JD.com Android client, Wechat JD.com Shopping, JD.com iPhone client, and generate a bar chart.

Analyze the geographical province of the user who bought the product. For example, more people in Beijing, Shanghai and Guangzhou buy iPhone7 on JD.com.

Store and retain all the above analysis results

Django background WEB

Use Django to build a simple background jd_analysis to connect distributed capture data with data analysis, and return the analysis results to the front end for display.

Jd_analysis provides an URL link to accept user requests to analyze JD.com Mall merchandise.

After receiving the link to the product, jd_analysis starts the crawler process and starts to grab the name and evaluation quantity of the product to be analyzed.

Assemble a complete evaluation link and insert it into redis to achieve distributed crawler crawling, and grab as much evaluation information as possible in a short time (I can grab about 3000 evaluation messages in 30 seconds now)

The master server waits for a certain crawl time. For example, the master server must return the analysis result to the front end after waiting for 30s, so after 30s, clear the link of the product in redis, and the slave server will automatically close the link that cannot be crawled.

Start the analysis process, start analyzing all the captured data, and generate icons and other information

Front-end display

When the client requests *, a GUID is generated and stored in the cookie. Then start a timer and take GUID to constantly request the results from the jd_analysis background. The jd_analysis background uses the requested GUID to get the crawling information and all the contents of the analysis result from the redis and returns it to the front end. The front end displays the requested result.

* two effect pictures are attached.

Purchase and comment time line chart

Purchase channel bar chart

Thank you for your reading, the above is the content of "how Python grabs JD.com Mall Evaluation". After the study of this article, I believe you have a deeper understanding of how Python crawls JD.com Mall evaluation, and the specific use still needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report