2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article shows how to use Python to analyze National Day tourist attractions, working through a practical case from crawling the data to visualizing it.
I. Objectives
Use Python to work out which tourist attractions to visit over National Day: places that are fun, cheap and not crowded. Otherwise you end up jostling just to take a photo!
II. Obtaining the data
Data analysis starts with getting the data. At first Brother Pig looked for tourism statistics on official websites, since official data is the most credible, but found nothing, which was a little disappointing.
So he looked for an alternative: crawl the ticket data for tourist attractions from a travel website, which also reflects how popular each attraction is.
The first site that came to mind was Qunar (literally "where to go"), and Brother Pig recommends it here: in his experience, for the same room in the same hotel, Qunar's price is usually the lowest, so it is the site he uses most.
With the target chosen, let's begin!
Note ⚠️: this tutorial is for learning and discussion only. If it infringes on anyone's rights or interests, please contact Brother Pig to have it removed.
1. Crawl a single page of data
Search Qunar's ticket page for "National Day tourist attractions" and you will see information about the recommended scenic spots: name, region, popularity, sales volume, price, grade, geographic information and so on. Very generous of them!
Then press F12 to open the browser's developer tools and find the URL that loads the data (turn the page and the request shows up).
Conveniently, it returns JSON data directly.
Finally, use the requests library to send a GET request.
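The request step might look like the following minimal sketch. The endpoint URL and the query parameter names (keyword, page) are placeholders, not taken from the article; substitute whatever you actually see in the browser's network tab.

```python
import requests

# Hypothetical endpoint: replace it with the URL found in the network tab.
LIST_URL = "https://example.com/ticket/list.json"


def fetch_page(page, keyword="National Day tourist attractions",
               session=None, timeout=10):
    """Fetch one page of search results and return the decoded JSON."""
    client = session or requests  # a requests.Session can be passed in
    # The parameter names below are assumptions for illustration.
    params = {"keyword": keyword, "page": page}
    response = client.get(LIST_URL, params=params, timeout=timeout)
    response.raise_for_status()  # raise on HTTP 4xx/5xx instead of parsing garbage
    return response.json()
```

Calling fetch_page(1) would return the first page as a Python dict. Passing a requests.Session as session reuses the connection across pages.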
Capturing a page of data like this is quite simple, isn't it?
Crawling data from Qunar's ticket page really is easy: no login, no proxy, not even custom headers are needed, and no restrictions appeared during batch crawling. It is much easier than Taobao!
2. Extract valid information
Now that you have the data, look at its structure and extract the attributes you want.
Here Brother Pig extracted the id, name, star rating, score, ticket price, sales volume, region, coordinates and profile, which covers essentially all of the useful information.
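As a rough sketch of the extraction step: the JSON layout (a list under data -> sightList) and every key name in FIELDS are assumptions made for illustration, so adjust them to the structure the site actually returns.

```python
# Output column name -> key in each JSON record. All key names are assumed.
FIELDS = {
    "id": "sightId",
    "name": "sightName",
    "star": "star",
    "score": "score",
    "price": "qunarPrice",
    "sales": "saleCount",
    "region": "districts",
    "point": "point",
    "intro": "intro",
}


def extract_sights(payload):
    """Return one flat dict per scenic spot, keeping only the wanted fields."""
    sights = (payload.get("data") or {}).get("sightList") or []
    # .get() yields None for a missing key instead of raising KeyError.
    return [{col: sight.get(key) for col, key in FIELDS.items()}
            for sight in sights]
```

Using .get() throughout means a record with a missing field still produces a row, with None in that column, instead of aborting the whole page.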
3. Save to Excel
After the necessary data has been extracted, we can save it. Here we use the pandas library to write an Excel file.
If you have not installed pandas yet, install it first:
pip install xlrd
pip install openpyxl
pip install numpy
pip install pandas
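Saving could then be sketched as below. The function names and the default file name sights.xlsx are illustrative choices, not the author's.

```python
import pandas as pd


def records_to_frame(records):
    """Turn a list of per-spot dicts into a DataFrame, one row per spot."""
    return pd.DataFrame(records)


def save_to_excel(df, path="sights.xlsx"):
    """Write the DataFrame to an .xlsx file (pandas uses openpyxl for this)."""
    df.to_excel(path, index=False)  # index=False drops the 0..n row labels
    return path
```

For example, save_to_excel(records_to_frame(rows)) writes every extracted row to sights.xlsx in the current directory.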
That completes the processing of a single page in three steps: crawling, parsing and saving.
4. Batch crawling
Batch crawling is also simple. First find the paging parameter: turn a few pages and compare the request parameters to see which one changes.
A quick comparison shows that the parameter page is the paging parameter, so we can wrap the single-page code in a for loop and pass in the page number to crawl in batches.
The total of 36 pages is what I saw on the web page. You can also detect automatically when the crawl is complete by checking the number of entries returned each time.
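The loop and the automatic stop condition could be sketched like this. The single-page fetcher is passed in as a callable so the sketch stays independent of any particular site. The names crawl_all, fetch_rows and delay are hypothetical.

```python
import time


def crawl_all(fetch_rows, max_pages=36, delay=0.0):
    """Collect rows page by page until max_pages or the first empty page.

    fetch_rows(page) must return the list of records for that page.
    """
    all_rows = []
    for page in range(1, max_pages + 1):
        rows = fetch_rows(page)
        if not rows:  # an empty page means there is nothing left to crawl
            break
        all_rows.extend(rows)
        if delay:
            time.sleep(delay)  # optional pause between requests
    return all_rows
```

With the empty-page check in place, max_pages is only an upper bound, so the hard-coded 36 no longer has to match the site exactly.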
Let's look at the result of batch crawling.
III. Analyzing the data
After downloading all the data, we need to think about how to use and analyze it. Brother Pig made a few simple analyses:
Ranking of scenic spots by ticket sales volume
Ranking of scenic spots by ticket revenue
Number of scenic spots at each level in each province
Heat map of scenic spot sales
Recommended scenic spot analysis
The visualization library is again pyecharts. There are more dimensions worth analyzing, and those are left for you to explore.
1. Ranking of scenic spots by ticket sales volume
Let's first rank the scenic spots by ticket sales volume.
We create a pivot table and sort it by sales volume, then generate a bar chart. Let's see the result:
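A minimal sketch of the pivot, sort and chart steps, assuming the crawled data sits in a DataFrame with name and sales columns (both column names are assumptions). pyecharts is imported inside render_bar so the ranking helper works without it installed.

```python
import pandas as pd


def top_by_sales(df, n=20):
    """Sum sales per scenic spot and return the n best sellers."""
    pivot = pd.pivot_table(df, index="name", values="sales", aggfunc="sum")
    return pivot.sort_values("sales", ascending=False).head(n)


def render_bar(ranking, path="sales_rank.html"):
    """Render the ranking as a pyecharts bar chart and return the file path."""
    from pyecharts import options as opts  # imported here: optional dependency
    from pyecharts.charts import Bar

    bar = (
        Bar()
        .add_xaxis(ranking.index.tolist())
        .add_yaxis("Ticket sales", ranking["sales"].tolist())
        .set_global_opts(title_opts=opts.TitleOpts(title="Ticket sales ranking"))
    )
    return bar.render(path)
```

Summing in the pivot table merges duplicate rows for the same scenic spot, which can happen when a spot appears on more than one results page.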
We can see that Disney's ticket sales rank first.
2. Ranking of scenic spots by ticket revenue
Revenue = unit price * sales volume, so we multiply the price and the sales volume in each row to get the revenue.
We put the revenue back into df as a new column and then sort by it.
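The revenue calculation can be sketched as follows, assuming numeric price and sales columns (the column names and the helper name add_revenue are illustrative).

```python
import pandas as pd


def add_revenue(df):
    """Add a revenue column (price * sales) and sort by it, highest first."""
    out = df.copy()  # leave the caller's DataFrame untouched
    out["revenue"] = out["price"] * out["sales"]
    return out.sort_values("revenue", ascending=False)
```

The multiplication is vectorized, so pandas computes the product for every row at once without an explicit loop.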
Disney really rakes in the money!
3. Number of scenic spots at each level in each province
I wanted to analyze the number of scenic spots at each level in each province, but ran out of time and have not finished it. Interested readers can download the source code and try it themselves.
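For readers who want to try it, one possible starting point is a cross-tabulation. This is not the author's code, and it assumes the data already has a province column and a star column, which the crawled region field may need to be split to produce.

```python
import pandas as pd


def count_by_province_and_star(df):
    """Count scenic spots for every (province, star level) combination."""
    # Rows are provinces, columns are star levels, missing pairs count as 0.
    return pd.crosstab(df["province"], df["star"])
```

The resulting table can be sorted by any star-level column, or passed to a stacked bar chart.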
4. Heat map of scenic spot sales
We have made plenty of heat maps before, all with pyecharts. Today we try something different and build the heat map with the Baidu Map open API (free). First you need to register an application on the Baidu Map Open Platform. The process is simple; you can search for instructions or see this article: https://jingyan.baidu.com/article/363872eccda8286e4aa16f4e.html
Note that when creating the application you must choose the browser-side application type.
Then download a demo HTML page for the Baidu heat map and replace the AK (access key) in it with your own.
After changing the AK, we change the JSON data: first generate JSON in the same format as the demo's default data, then substitute it.
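Generating the replacement data might look like the sketch below. It assumes the demo expects a list of objects with lng, lat and count keys, and that each row stores its coordinates as a "lng,lat" string in a point column. Check both assumptions against the demo page and your own data.

```python
import json

import pandas as pd


def to_heatmap_points(df):
    """Convert rows into [{"lng": ..., "lat": ..., "count": ...}, ...]."""
    points = []
    # Skip rows that lack coordinates or a sales figure.
    for row in df.dropna(subset=["point", "sales"]).itertuples(index=False):
        lng, lat = (float(value) for value in row.point.split(","))
        points.append({"lng": lng, "lat": lat, "count": int(row.sales)})
    return points


def to_heatmap_json(df):
    """Serialize the points as a JSON array ready to paste into the demo page."""
    return json.dumps(to_heatmap_points(df))
```

The string returned by to_heatmap_json(df) is what replaces the default points array in the demo HTML.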
Finally, let's look at the result. The map is interactive and supports zooming in and out, so you can inspect the heat map of scenic spots by province, city and district.
5. Recommended scenic spot analysis
What kind of scenic spot should be recommended? Brother Pig's answer: high score, low sales volume and low price.
The recommendation coefficient is proportional to the score and inversely proportional to sales volume and price, so Brother Pig designed the simplest possible formula:
Recommendation coefficient = score / (sales volume * price) * 1000
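Applied to a DataFrame, the formula could be sketched as below. The column names score, sales and price are assumptions, and the handling of zero sales or zero price is an addition the article does not mention.

```python
import pandas as pd


def add_recommendation(df):
    """Add recommend = score / (sales * price) * 1000, best candidates first."""
    out = df.copy()
    denominator = (out["sales"] * out["price"]).astype(float)
    # Added safeguard: a zero denominator becomes NaN instead of dividing by zero.
    denominator = denominator.where(denominator != 0)
    out["recommend"] = out["score"] / denominator * 1000
    # sort_values places NaN rows last by default.
    return out.sort_values("recommend", ascending=False)
```

Note that the formula strongly favors spots with very few sales, so a minimum sales threshold may be worth adding before trusting the top results.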
Take a look at the results of this simple recommendation algorithm.