How to crawl food web information with python code

2025-04-01 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/01

This article explains in detail how to crawl food website information with Python code. The editor found the content worthwhile and is sharing it here as a reference; hopefully you will have a working understanding of the topic after reading it.

Three-step data crawl, part 1: a pit ahead

For work I needed to collect food data from OTA websites, such as the types of restaurants in a given city. For a foodie that should be no big deal... yet in the end I had no time to eat lunch or dinner. Here is how it went.

Pressing F12 in Chrome locates the GET request right away, and the response is JSON. Studying the GET parameters turns up one strange parameter: token?!

Let's ignore it for now, modify the parameters directly, and page through the results to request the data!!!
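The paging idea can be sketched as below. This is only an illustration: the endpoint `BASE_URL` and the parameter names `cityName`, `page` and `token` are assumptions, not the site's real API, and the sketch only builds the URLs rather than sending them.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real one is whatever DevTools shows for the GET request.
BASE_URL = "https://example.com/api/poi/list"


def build_page_url(city, page, token=None):
    """Return the GET URL for one result page (parameter names are assumptions)."""
    params = {"cityName": city, "page": page}
    if token is not None:
        params["token"] = token
    return BASE_URL + "?" + urlencode(params)


# Turn the page by changing only the `page` parameter.
urls = [build_page_url("beijing", page) for page in range(1, 4)]
```

Keeping the token optional mirrors the first attempt described above, where the token was simply left out.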

Three-step data crawl, part 2: getting started

And here is the problem! After struggling for a long time, I found that the token is time-limited and is generated by JS... No matter: if plain GET requests do not work, there is still Selenium. Sadly, Meituan really is a big company, and it blocks Selenium outright.

Three-step data crawl, part 3: filling the pit

Back to square one. There was no choice but to start from the token. After some searching, I found a JS file.

Um... well, moving on. I had never called JS directly from Python before, so I searched Baidu and found that PyExecJS, PyV8 and similar libraries can do it. Sadly, on my Python 2.7 PyExecJS never worked properly after installation, while PyV8 ran without problems, although installing PyV8 was a painful process.

Enough talk, straight to the code:

I saved the JS file locally, and Python uses PyV8 to evaluate the JS function that generates the token.
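The original code is not reproduced in this copy of the article, so the following is only a sketch of the approach, not the author's script. It assumes PyV8's `JSContext` is used as a context manager with `eval`; the file name `token.js` and the function name `getToken()` are hypothetical. The context factory is a parameter so the helper can be exercised without PyV8 installed.

```python
def run_js(js_source, call_expression, context_factory=None):
    """Load js_source into a JS context and return the value of call_expression."""
    if context_factory is None:
        import PyV8  # imported lazily; only needed when no factory is supplied
        context_factory = PyV8.JSContext
    with context_factory() as ctx:
        ctx.eval(js_source)               # define the functions from the saved JS file
        return ctx.eval(call_expression)  # e.g. "getToken()" (hypothetical name)


# Intended use (file and function names are assumptions):
#     with open("token.js") as f:
#         token = run_js(f.read(), "getToken()")
```

Because the token is time-limited, calling this helper again before each batch of requests is the simplest way to keep it fresh.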

With the program generating tokens automatically, I could not wait to move on to parsing the JSON data and writing it to the database.
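A minimal sketch of the parse-and-store step, under stated assumptions: the article does not say which database was used, so SQLite stands in here, and the JSON layout (`data`, `name`, `category`, `reviews`, `avg_price`) is invented for illustration.

```python
import json
import sqlite3


def save_restaurants(conn, response_text, city):
    """Parse one JSON response and insert its records; returns the number inserted."""
    payload = json.loads(response_text)
    # Field names below are assumptions about the response layout.
    rows = [
        (city, item["name"], item["category"], item["reviews"], item["avg_price"])
        for item in payload["data"]
    ]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS restaurant ("
        "city TEXT, name TEXT, category TEXT, reviews INTEGER, avg_price REAL)"
    )
    conn.executemany("INSERT INTO restaurant VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
sample = '{"data": [{"name": "A", "category": "hot pot", "reviews": 120, "avg_price": 98.0}]}'
inserted = save_restaurants(conn, sample, "beijing")
```

Storing the city alongside each record makes the later Beijing versus Shanghai comparisons a matter of grouping.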

Once testing was done, I crawled the Beijing and Shanghai data for visualization.

Tallying the results showed that Meituan still caps the data: each type of restaurant returns at most 32 entries per page, which works out to 32*32=1024 records per type (apparently 32 pages of 32 entries each).

Data visualization

Proportion of food types in Beijing and Shanghai

As the chart shows, Sichuan and Hunan cuisine, barbecue, and Western food account for the largest shares in both cities. Skewers (lu chuan) and spicy crayfish (ma xiao) really know no north-south divide.
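The share of each food type can be computed along these lines. The category labels are made-up sample data, not the crawled results.

```python
from collections import Counter


def category_shares(categories):
    """Map each category to its fraction of all records."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


sample = ["hot pot", "barbecue", "barbecue", "western"]  # illustrative labels only
shares = category_shares(sample)
```

Running this once per city gives the two sets of proportions that a pie or bar chart would plot.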

Each store also has a review count, which we can analyze as a measure of popularity; for that reason, only the top 10 are shown.

Top 10 cuisines in Beijing and Shanghai

Hot pot tops the list in both Beijing and Shanghai.
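A popularity ranking of this kind can be sketched as follows. Summing review counts per cuisine is an assumption about how the ranking was built, and the records are invented sample data.

```python
from collections import defaultdict


def top_cuisines(records, n=10):
    """Sum review counts per cuisine and return the n largest as (cuisine, reviews)."""
    totals = defaultdict(int)
    for cuisine, reviews in records:
        totals[cuisine] += reviews
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


records = [("hot pot", 100), ("barbecue", 40), ("hot pot", 50), ("western", 60)]
ranking = top_cuisines(records, n=2)
```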

Below we compare the average price of similar food in the two places:

Shanghai's consumption level has already overtaken that of the Imperial Capital (Beijing)... hahaha

That is all on how to crawl food website information with Python code. Hopefully the above is of some help and teaches you something new. If you found the article useful, feel free to share it so that more people can see it.
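The price comparison comes down to a grouped average. The sketch below assumes each record carries a city, a cuisine and an average price; the numbers are invented for illustration.

```python
from collections import defaultdict


def average_price(records):
    """Return {(city, cuisine): mean price} for (city, cuisine, price) records."""
    sums = defaultdict(lambda: [0.0, 0])  # running [total, count] per group
    for city, cuisine, price in records:
        entry = sums[(city, cuisine)]
        entry[0] += price
        entry[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


records = [
    ("beijing", "hot pot", 100.0),
    ("beijing", "hot pot", 120.0),
    ("shanghai", "hot pot", 130.0),
]
averages = average_price(records)
```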

© 2024 shulou.com SLNews company. All rights reserved.