
How to use Python to crawl city bus stops




This article introduces the relevant knowledge of "how to use Python to crawl city bus stops". Many people run into this kind of problem in real cases, so let the editor walk you through how to handle these situations. I hope you read it carefully and get something out of it!

Page analysis

https://guiyang.8684.cn/line1
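Before writing the crawler it is worth confirming the page structure. A minimal sketch, assuming the route links sit inside a div with class 'list clearfix' (the same selector the crawler below relies on):

# quick check of the page structure used by the crawler below
import requests
from bs4 import BeautifulSoup

res = requests.get('https://guiyang.8684.cn/line1')
soup = BeautifulSoup(res.text, 'lxml')
div = soup.find('div', class_='list clearfix')  # container that holds the route links
if div is not None:
    print([a.text for a in div.find_all('a')])  # the route names on this page
else:
    print('page structure may have changed; inspect the HTML source')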

Crawler

We use requests to fetch the pages and BeautifulSoup to parse them and extract the site data. After getting the bus routes, we use the Gaode (Amap) API to obtain the longitude and latitude of each stop, and use pandas to parse the JSON response. I recommend writing the code in an object-oriented way.

import requests
import json
from bs4 import BeautifulSoup
import pandas as pd

class bus_stop:
    ## define a class used to obtain the stop names and longitude/latitude of each bus route
    def __init__(self):
        self.url = 'https://guiyang.8684.cn/line{}'
        self.starnum = []
        for start_num in range(1, 17):
            self.starnum.append(start_num)
        self.payload = {}
        self.headers = {'Cookie': 'JSESSIONID=48304F9E8D55A9F2F8ACC14B7EC5A02D'}

    ## call the Gaode (Amap) API to obtain the longitude and latitude of a bus route
    ## you can apply for this key yourself
    def get_location(self, line):
        url_api = 'https://restapi.amap.com/v3/bus/linename?s=rsv3&extensions=all&key=559bdffe35eec8c8f4dae959451d705c&output=json&city=Guiyang&offset=2&keywords={}&platform=JS'.format(line)
        res = requests.get(url_api).text
        # print(res)  # can be used to verify whether the response contains the data you need
        rt = json.loads(res)
        dicts = rt['buslines'][0]
        # return a DataFrame object
        df = pd.DataFrame.from_dict([dicts])
        return df

    ## get the name of each bus route
    def get_line(self):
        for start in self.starnum:
            start = str(start)
            # construct the url
            url = self.url.format(start)
            res = requests.request("GET", url, headers=self.headers, data=self.payload)
            soup = BeautifulSoup(res.text, "lxml")
            div = soup.find('div', class_='list clearfix')
            lists = div.find_all('a')
            for item in lists:
                line = item.text  # get the bus route name
                lines.append(line)
        return lines

if __name__ == '__main__':
    bus_stop = bus_stop()
    stop_df = pd.DataFrame([])
    lines = []
    bus_stop.get_line()
    # output the routes
    print('There are {} bus routes in total'.format(len(lines)))
    print(lines)
    # exception handling
    error_lines = []
    for line in lines:
        try:
            df = bus_stop.get_location(line)
            stop_df = pd.concat([stop_df, df], axis=0)
        except:
            error_lines.append(line)
    # output the abnormal routes
    print('There are {} abnormal bus routes'.format(len(error_lines)))
    print(error_lines)
    # output the file size
    print(stop_df.shape)
    stop_df.to_csv('bus_stop.csv', encoding='gbk', index=False)
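Before looping over every route, it can help to smoke-test a single one. A minimal sketch, assuming the key in the class above is still valid and that '1路' is an existing route name in Guiyang (both are assumptions to verify):

# hedged smoke test: fetch one route and look at the columns we will clean later
bs = bus_stop()
one_route = bs.get_location('1路')  # the route name here is an assumption
print(one_route.columns)            # expect 'id' and 'busstops' among the columns
print(one_route[['id', 'busstops']])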

Data cleaning

Let's look at the result first. I need to clean the busstops column. The overall idea is: split -> unpivot -> split. I will present two methods: one is Excel Power Query (PQ), the other is Python.

Excel PQ data cleaning

This method makes full use of Power Query and is purely interface operation, so it is not a big problem; let's just look at the process. The core steps are the same as described above.

Python data cleaning

## the columns we need to deal with: the id column and the busstops column
data = stop_df[['id', 'busstops']]
data.head()

## dictionary or list
df_pol = data.copy()
# set the index column
df_pol.set_index('id', inplace=True)
df_pol.head()
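One step seems to have been lost between this block and the unpivot below: with only the id index and the busstops column, melt(value_name='busstops') would clash with the existing column name. The first "split" of the split -> unpivot -> split pipeline presumably spreads each route's stop list into one column per stop. A minimal sketch of what that step likely looks like, assuming each busstops cell is the list of stop dictionaries returned by the Amap API:

## first "split" (hedged reconstruction): one column per stop along the route
df_pol = df_pol['busstops'].apply(pd.Series)
df_pol.head()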

## unpivot
# reset the index
df_pol.reset_index(inplace=True)
# unpivot (melt) operation
df_pol_ps = df_pol.melt(id_vars=['id'], value_name='busstops')
df_pol_ps.head()

## delete blank rows
df_pol_ps.dropna(inplace=True, axis=0)
df_pol_ps.shape

## split the stop dictionaries into columns
df_parse = df_pol_ps['busstops'].apply(pd.Series)
# keep the line id
df_parse['line_id'] = df_pol_ps['id']
df_parse

One thing to add here: normally we would also tidy up the location column and split out the Long and lat values, but we won't do that here since it is repetitive work, and the PQ cleaning I use is much faster.
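For reference, a minimal sketch of the step being skipped, assuming each stop dictionary from the Amap API carries a 'location' string of the form 'lng,lat' (check the column name against your own df_parse):

# hedged sketch: split the 'location' column into numeric longitude / latitude columns
df_parse[['lng', 'lat']] = df_parse['location'].str.split(',', expand=True)
df_parse['lng'] = df_parse['lng'].astype(float)
df_parse['lat'] = df_parse['lat'].astype(float)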

## write the file
df_parse.to_excel('bus stop distribution in Guiyang.xlsx', index=False)

QGIS coordinate correction

I won't go over the basic operation of QGIS. By the way, QGIS handles the csv format better, so I recommend importing the file into QGIS in csv format.

Import the csv file

Coordinate correction

As mentioned earlier, the coordinates from Amap are GCJ-02 coordinates; we need to convert them to WGS 1984 coordinates, and for that we use the GeoHey plugin in QGIS.
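If you would rather do the conversion in Python instead of in QGIS, here is a minimal sketch of the widely circulated GCJ-02 to WGS-84 approximation (not the GeoHey plugin's exact implementation; typical error is on the order of metres, and the offset only applies inside China):

import math

A = 6378245.0                  # semi-major axis used by the GCJ-02 transform
EE = 0.00669342162296594323    # eccentricity squared

def _transform_lat(x, y):
    ret = -100.0 + 2.0 * x + 3.0 * y + 0.2 * y * y + 0.1 * x * y + 0.2 * math.sqrt(abs(x))
    ret += (20.0 * math.sin(6.0 * x * math.pi) + 20.0 * math.sin(2.0 * x * math.pi)) * 2.0 / 3.0
    ret += (20.0 * math.sin(y * math.pi) + 40.0 * math.sin(y / 3.0 * math.pi)) * 2.0 / 3.0
    ret += (160.0 * math.sin(y / 12.0 * math.pi) + 320.0 * math.sin(y * math.pi / 30.0)) * 2.0 / 3.0
    return ret

def _transform_lng(x, y):
    ret = 300.0 + x + 2.0 * y + 0.1 * x * x + 0.1 * x * y + 0.1 * math.sqrt(abs(x))
    ret += (20.0 * math.sin(6.0 * x * math.pi) + 20.0 * math.sin(2.0 * x * math.pi)) * 2.0 / 3.0
    ret += (20.0 * math.sin(x * math.pi) + 40.0 * math.sin(x / 3.0 * math.pi)) * 2.0 / 3.0
    ret += (150.0 * math.sin(x / 12.0 * math.pi) + 300.0 * math.sin(x / 30.0 * math.pi)) * 2.0 / 3.0
    return ret

def gcj02_to_wgs84(lng, lat):
    # approximate inverse of the WGS-84 -> GCJ-02 offset (valid inside China only)
    dlat = _transform_lat(lng - 105.0, lat - 35.0)
    dlng = _transform_lng(lng - 105.0, lat - 35.0)
    radlat = lat / 180.0 * math.pi
    magic = 1 - EE * math.sin(radlat) ** 2
    sqrtmagic = math.sqrt(magic)
    dlat = (dlat * 180.0) / ((A * (1 - EE)) / (magic * sqrtmagic) * math.pi)
    dlng = (dlng * 180.0) / (A / sqrtmagic * math.cos(radlat) * math.pi)
    # the forward transform adds the offset, so subtract it from the GCJ-02 point
    return lng - dlng, lat - dlat

# rough sanity check near downtown Guiyang (coordinates are approximate)
print(gcj02_to_wgs84(106.63, 26.65))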

This is the end of "how to use Python to crawl city bus stops". Thank you for reading. If you want to learn more about the industry, you can follow the website, where the editor will keep publishing high-quality practical articles for you!
