2025-10-25 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to use Python to crawl data on Meituan's barbecue restaurants, where the search API on its own returns only 1024 stores. It is fairly detailed and may be useful as a reference.
Analyze the real URL
https://apimobile.meituan.com/group/v4/poi/pcsearch/30?uuid=YOUR_UUID&userid=-1&limit=32&offset=32&cateId=-1&q=%E7%83%A4%E8%82%89
(replace YOUR_UUID with your own uuid)
Main parameters:
30: city id (30 represents Shenzhen)
limit: number of stores per page
offset: paging parameter (increases by 32 with each page)
q: search keyword (barbecue in this case, URL-encoded)
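The parameters above can be assembled programmatically instead of by hand. The following is a minimal sketch, not the author's code: the helper name build_search_url, the YOUR_UUID placeholder and the optional area_id argument are assumptions made for illustration.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the URL analyzed above; the city id is part of the path.
BASE_URL = "https://apimobile.meituan.com/group/v4/poi/pcsearch/{city_id}"


def build_search_url(city_id, keyword, offset, limit=32,
                     uuid="YOUR_UUID", area_id=None):
    """Build a search URL (hypothetical helper, for illustration only)."""
    params = {
        "uuid": uuid,        # placeholder: use your own uuid
        "userid": -1,
        "limit": limit,      # stores per page
        "offset": offset,    # grows by `limit` for each page
        "cateId": -1,
        "q": keyword,        # percent-encoded as UTF-8 by urlencode
    }
    if area_id is not None:
        params["areaId"] = area_id  # sub-region filter
    return BASE_URL.format(city_id=city_id) + "?" + urlencode(params, quote_via=quote)


# Offsets for the 32 pages of 32 stores (1024 stores) that one query can reach.
page_offsets = list(range(0, 1024, 32))
second_page_url = build_search_url(30, "烤肉", page_offsets[1])
```

Building the query string with urlencode keeps the keyword encoding and the paging arithmetic in one place, which makes it harder to mistype a parameter when areaId is added later.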
Crawling with the API above alone returns data for only 1024 stores. To obtain more complete data, you also need to find the areaId parameter (sub-region) and then traverse the sub-regions. For reasons of space, only the core code is given.
import random
import time

import requests


def get_meituan():
    # areaId_list (the sub-region ids) is built elsewhere and is not shown in this excerpt.
    try:
        for areaId in areaId_list:
            for x in range(0, 2000, 32):
                # Set sleep time. The source shows only the lower bound (2);
                # the upper bound of 4 seconds is an assumption.
                time.sleep(random.uniform(2, 4))
                # Print crawl progress
                print('crawling areaId %s, page %d' % (areaId, int((x + 32) / 32)))
                url = 'https://apimobile.meituan.com/group/v4/poi/pcsearch/30?uuid=YOUR_UUID&userid=-1&limit=32&offset={0}&cateId=-1&q=%E7%83%A4%E8%82%89&areaId={1}'.format(x, areaId)
                print(url)
                headers = {
                    'Accept': '*/*',
                    'Accept-Encoding': 'gzip, deflate, br',
                    'Accept-Language': 'zh-CN,zh;q=0.9',
                    'Connection': 'keep-alive',
                    'Cookie': 'YOUR_COOKIE',  # replace with your own cookie
                    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',
                    'Host': 'apimobile.meituan.com',
                    'Origin': 'https://sr.meituan.com',
                    'Referer': 'https://sr.meituan.com/s/%E7%83%A4%E8%82%89/',
                }
                response = requests.get(url, headers=headers)
                print(response.status_code)
    except Exception as e:
        # The excerpt omits the original handler; this one only reports the error.
        print(e)

Data processing
Information on more than 20,000 barbecue restaurants was crawled in just a few minutes. To make visual analysis easier, the crawled data also needs some simple cleaning.
Import data
Import the data, add column names, and use the sample() method to preview 5 randomly selected rows.
import pandas as pd
import numpy as np

df = pd.read_csv(
    '/Users/wangjia/Documents/technology account/project/2.spider/Meituan/Shenzhen barbecue 1.csv',
    names=['store name', 'store address', 'per capita consumption', 'store rating',
           'number of comments', 'business district', 'picture link', 'store type',
           'contact information'])
df.sample(5)
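To try the import step without the author's CSV file, the same call can be run against a few in-memory rows. This is only a sketch: the three rows, the example.com links and the tel-1 style contact values are made up, and the column names follow the list used in this article.

```python
import io

import pandas as pd

COLUMNS = ['store name', 'store address', 'per capita consumption',
           'store rating', 'number of comments', 'business district',
           'picture link', 'store type', 'contact information']

# Made-up rows standing in for the crawled file, which has no header line.
sample_csv = io.StringIO(
    "Shop A,Address 1,85,4.5,120,Area X,http://example.com/a.jpg,barbecue,tel-1\n"
    "Shop B,Address 2,60,3.9,45,Area Y,http://example.com/b.jpg,barbecue,\n"
    "Shop C,Address 3,110,4.8,310,Area X,http://example.com/c.jpg,barbecue,tel-3\n"
)

# Passing names= tells pandas the file has no header row and labels the columns.
df_demo = pd.read_csv(sample_csv, names=COLUMNS)

# random_state makes the random preview repeatable.
preview = df_demo.sample(2, random_state=0)
```

The empty last field in the second row is read as NaN, which is the kind of missing value handled in the cleaning steps.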
 
Delete duplicate data
df = df.drop_duplicates()

Missing value processing
As the above shows, only the contact information field contains missing values, which are filled with text.
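One way to confirm which columns contain missing values is to count them per column before filling. The sketch below uses a small made-up frame, not the crawled data, and the tel-1 value is a placeholder.

```python
import pandas as pd

# Made-up frame: only the contact column has gaps.
df_demo = pd.DataFrame({
    'store name': ['Shop A', 'Shop B', 'Shop C'],
    'store rating': [4.5, 3.9, 4.8],
    'contact information': ['tel-1', None, None],
})

# Number of missing values in each column.
missing_per_column = df_demo.isna().sum()

# Fill every missing cell with the same placeholder text.
filled = df_demo.fillna('no data yet')
```

Calling fillna on the whole frame is safe here only because a single text column has gaps; if numeric columns also had missing values, filling them with text would change their dtype.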
df = df.fillna('no data yet')

Store address cleaning
Extract the district or county from the store address field by taking its first three characters. In addition, the prefix "South Australia University" belongs to Longgang District and is replaced directly with the replace() method.
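The slice only makes sense on the original Chinese addresses, where a district name such as 龙岗区 (Longgang District) occupies the first three characters. The sketch below illustrates this under stated assumptions: the sample addresses are made up, and 南澳大 is a guess at the Chinese prefix that the translated text renders as "South Australia University".

```python
import pandas as pd

# Made-up addresses; the first three characters normally name the district.
addresses = pd.Series([
    '龙岗区示例路1号',
    '南山区示例路2号',
    '南澳大示例路3号',  # assumed out-of-pattern prefix
])

# Take the first three characters, then map the odd prefix to its district.
district = addresses.str[:3].str.replace('南澳大', '龙岗区', regex=False)
```

Passing regex=False makes the replacement a literal string match, so the result does not depend on the pandas version's default.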
df['district and county'] = df['store address'].str[:3].str.replace('South Australia University', 'Longgang District')

Store rating cleaning
Following Meituan's scoring method, the store rating field is binned to obtain a rating type column.
cut = lambda x: 'normal' if x
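The lambda above is cut off in the source, so the actual thresholds are not known. As an illustration of the binning idea only, the sketch below maps a score to a label using hypothetical cutoffs (4.0 and 4.5) and hypothetical labels other than 'normal'; none of these values come from the article or from Meituan.

```python
import pandas as pd

NORMAL_BELOW = 4.0  # hypothetical cutoff, not from the source
GOOD_BELOW = 4.5    # hypothetical cutoff, not from the source


def rating_type(score):
    """Map a numeric rating to a label (illustrative thresholds only)."""
    if score < NORMAL_BELOW:
        return 'normal'
    if score < GOOD_BELOW:
        return 'good'       # hypothetical label
    return 'very good'      # hypothetical label


# Made-up ratings standing in for the store rating column.
ratings = pd.Series([3.6, 4.2, 4.8])
rating_types = ratings.map(rating_type)
```

On the real data the same function could be applied with df['store rating'].map(rating_type) once the cutoffs are replaced with the ones the author intended.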