2025-01-18 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/02 Report
Today we will look at how to use a Python web crawler to obtain tourist attraction information. Many readers may be unfamiliar with the topic, so this article summarizes the process step by step; I hope you get something out of it.
Crawler series:
When we travel, we often look up the tourist attractions in a place: the scenic spots, ticket prices, opening hours, user reviews, and so on.
1. Project objectives
Get each scenic spot's name, opening hours, featured reviews, price, and other information from the site.
2. Libraries and websites involved
List the URL first, as follows:
Web site: https://go.hao123.com/ticket?city=%E5%B9%BF%E5%B7%9E&theme=all&pn=1
In the URL, city=%E5%B9%BF%E5%B7%9E is the URL-encoded name of the city (Guangzhou), and pn is the page number.
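The encoded city parameter can be checked with Python's standard library. A quick illustrative snippet (the variable names are my own):

```python
from urllib.parse import quote, unquote

# Decode the city parameter: %E5%B9%BF%E5%B7%9E is the percent-encoded
# UTF-8 form of the city name for Guangzhou.
city = unquote("%E5%B9%BF%E5%B7%9E")
print(city)  # -> 广州

# Build the listing URL for a given page number (pn)
pn = 1
url = f"https://go.hao123.com/ticket?city={quote(city)}&theme=all&pn={pn}"
print(url)
```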
Required libraries: requests, lxml, pprint
3. Import the required libraries
The concrete implementation needs the following imports:
import requests
from lxml import etree
from pprint import pprint
After importing the libraries, we define a class, give it an __init__ method, and define a main function. In __init__, we first prepare the url and the request headers.
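The class skeleton described above might look like the following. The original screenshot is missing, so the class name, URL template, and User-Agent string are illustrative assumptions:

```python
class TicketSpider:
    """Crawler for the scenic-spot listing pages (sketch; names are illustrative)."""

    def __init__(self):
        # {} will later be filled with the page number (pn)
        self.url = ("https://go.hao123.com/ticket"
                    "?city=%E5%B9%BF%E5%B7%9E&theme=all&pn={}")
        # A browser-like User-Agent so the site is less likely to reject us
        self.headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                          "AppleWebKit/537.36 (KHTML, like Gecko) "
                          "Chrome/120.0 Safari/537.36"
        }
```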
Next, define a request function that fetches the response data:
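A minimal sketch of such a request method, assuming the class layout from the previous step (the timeout, error handling, and encoding guess are my additions, not from the original figure):

```python
import requests


class TicketSpider:
    def __init__(self):
        self.url = ("https://go.hao123.com/ticket"
                    "?city=%E5%B9%BF%E5%B7%9E&theme=all&pn={}")
        self.headers = {"User-Agent": "Mozilla/5.0"}

    def get_html(self, url):
        """Request a page and return the decoded HTML text, or None on failure."""
        try:
            response = requests.get(url, headers=self.headers, timeout=10)
            response.raise_for_status()
            # Let requests guess the charset so Chinese text decodes correctly
            response.encoding = response.apparent_encoding
            return response.text
        except requests.RequestException:
            return None
```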
After the data is requested, we need to parse it:
1) Get the secondary-page (detail-page) link for each scenic spot name: open the developer tools in the Chrome browser (or press F12), select the Elements panel, and inspect the page to find the xpath of the link. Based on this analysis, we can write the extraction code.
2) After obtaining the secondary-page link, send a request to it, get the response, and parse the data.
3) Define a dictionary that stores the scenic spot's name, opening hours, reviews, and price, and use a conditional statement to check whether each field is empty.
4) Finally, define a main function that ties the steps together.
4. Effect display
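Putting the steps above together, here is a hedged sketch of the parsing logic and main function. The original screenshots are missing, so every xpath expression and field name below is a guess; the real element paths must be found with the browser developer tools as described:

```python
import requests
from lxml import etree
from pprint import pprint


class TicketSpider:
    def __init__(self):
        self.url = ("https://go.hao123.com/ticket"
                    "?city=%E5%B9%BF%E5%B7%9E&theme=all&pn={}")
        self.headers = {"User-Agent": "Mozilla/5.0"}

    def get_html(self, url):
        response = requests.get(url, headers=self.headers, timeout=10)
        response.raise_for_status()
        response.encoding = response.apparent_encoding
        return response.text

    def parse_list(self, html):
        """Extract each attraction's detail-page (secondary-page) link.

        The xpath is illustrative; inspect the real page with F12 to find
        the correct element path."""
        tree = etree.HTML(html)
        return tree.xpath('//a[@class="spot-link"]/@href')  # hypothetical path

    def parse_detail(self, html):
        """Collect the fields into a dictionary, guarding against empty results."""
        tree = etree.HTML(html)

        def first(path):
            result = tree.xpath(path)
            return result[0].strip() if result else ""  # empty-content check

        return {
            "name": first('//h1/text()'),  # scenic spot name
            "open_time": first('//span[@class="open-time"]/text()'),
            "review": first('//div[@class="review"]/text()'),
            "price": first('//em[@class="price"]/text()'),
        }

    def main(self):
        pages = int(input("Enter the number of pages to crawl: "))
        for pn in range(1, pages + 1):
            html = self.get_html(self.url.format(pn))
            for link in self.parse_list(html):
                detail = self.get_html(link)
                pprint(self.parse_detail(detail))


# Run interactively with: TicketSpider().main()
```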
Click the green Run button; you will be prompted to enter the number of pages you want to crawl, and the results are printed to the console.
5. Summary
It is not recommended to crawl too much data, as that puts unnecessary load on the server; a small sample is enough.
I hope this project helps you better understand how to crawl tourist attraction data.
You are welcome to try it yourself. Watching someone else do it often looks simple, but doing it yourself always surfaces all kinds of problems. Don't aim too high; work through it diligently and you will understand it more deeply.
After reading the above, do you have a better understanding of how to use a Python web crawler to obtain tourist attraction information? If you want to learn more, please follow the industry information channel. Thank you for your support.