How to understand the process of obtaining web page data by Python

This article introduces how Python obtains data from a web page. Many people have doubts about how this process works, so the editor has consulted various materials and sorted out a simple, easy-to-follow method. I hope it helps answer those doubts; please follow along and study!
The requests library is the go-to third-party library for initiating HTTP requests in Python, and it is very easy to use.
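As a warm-up, here is a minimal sketch of a GET request (httpbin.org is used purely as an illustrative target; it is not part of the original article):

import requests

# send a simple GET request and inspect the response
resp = requests.get('https://httpbin.org/get')
print(resp.status_code)                 # HTTP status code, e.g. 200
print(resp.headers['Content-Type'])     # response headers behave like a dict
print(resp.text[:200])                  # response body decoded as text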
Send a GET request
When we use a browser to open the Dongxu Lantian stock page, the request sent under the hood is a GET request, with the parameters passed in the URL.
import requests

url = 'http://push3his.eastmoney.com/api/qt/stock/fflow/daykline/get'
Use the get function of the requests library to fetch the data, first setting a request header so the request looks like it comes from a browser.
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'}
Next, copy the request parameters shown in the browser's Network panel. (The cb parameter names a JSONP callback that will wrap the response, which is why the response has to be unwrapped with a regular expression later; the trailing '_' value appears to be a cache-busting timestamp.)
data = {'cb': 'jQuery1123026726575651052076_1633873068863',
        'lmt': '0',
        'klt': '101',
        'fields1': 'f1,f2,f3,f7',
        'fields2': 'f51,f52,f53,f54,f55,f56,f57,f58,f59,f60,f61,f62,f63,f64,f65',
        'ut': 'b2884a393a59ad64002292a3e90d46a5',
        '_': '1633873068864'}
We use the content attribute to get the raw bytes returned by the site, and name the result sd.
# query parameters of a GET request are passed via params
sd = requests.get(url=url, headers=header, params=data).content
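In practice it is worth adding a timeout and a status check before trusting the response. A hedged sketch (note that live versions of this Eastmoney endpoint generally also expect a secid parameter identifying the stock; if so, it has been lost from the snippet above):

# same request with basic error handling
resp = requests.get(url, headers=header, params=data, timeout=10)
resp.raise_for_status()   # raise an exception on HTTP 4xx/5xx errors
sd = resp.content         # raw bytes of the response body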
The json library can parse JSON from a string or file and convert it into a Python dictionary or list. The re module is Python's built-in module for matching strings: many of its functions are based on regular expressions, which match string patterns and extract just the parts you need.
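Both are needed here because the cb parameter makes the site return JSONP: the JSON payload arrives wrapped in a jQuery callback call. A toy illustration of the unwrapping pattern (the callback name and payload are made up):

import json
import re

# a made-up JSONP response: JSON wrapped in a callback function call
jsonp = 'jQuery12345_000({"data": {"klines": ["a,b,c"]}})'
# grab everything between the outermost parentheses
payload = re.findall(r'[(](.*)[)]', jsonp)[0]
obj = json.loads(payload)
print(obj['data']['klines'])   # ['a,b,c']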
import json
import re

# decode the raw bytes, then strip the jQuery callback wrapper
text = str(sd, 'utf-8')
res = re.findall(r'[(](.*)[)]', text)
# parse the extracted JSON into a Python dictionary
result = json.loads(res[0])
p = result['data']['klines']
Each element of klines is one day's record packed into a single comma-separated string: index 0 is the date, index 11 the closing price, index 12 the percentage change, and indices 1-10 the net inflow amounts and shares. Organize this raw data into columns and write it to Excel, as follows:
all_list = result['data']['klines']
data_list = []
latest_price_list = []
price_limit_list = []
net_amount_list1 = []
net_proportion_list1 = []
net_amount_list2 = []
net_proportion_list2 = []
net_amount_list3 = []
net_proportion_list3 = []
net_amount_list4 = []
net_proportion_list4 = []
net_amount_list5 = []
net_proportion_list5 = []
for i in range(len(all_list)):
    # date
    data = all_list[i].split(',')[0]
    data_list.append(data)
    # closing price
    latest_price = all_list[i].split(',')[11]
    latest_price_list.append(latest_price)
    # price change (%)
    price_limit = all_list[i].split(',')[12]
    price_limit_list.append(price_limit)
    # main force net inflow: net amount
    net_amount1 = all_list[i].split(',')[1]
    net_amount_list1.append(net_amount1)
    # main force net inflow: net share
    net_proportion1 = all_list[i].split(',')[6]
    net_proportion_list1.append(net_proportion1)
    # extra-large order net inflow: net amount
    net_amount2 = all_list[i].split(',')[5]
    net_amount_list2.append(net_amount2)
    # extra-large order net inflow: net share
    net_proportion2 = all_list[i].split(',')[10]
    net_proportion_list2.append(net_proportion2)
    # large order net inflow: net amount
    net_amount3 = all_list[i].split(',')[4]
    net_amount_list3.append(net_amount3)
    # large order net inflow: net share
    net_proportion3 = all_list[i].split(',')[9]
    net_proportion_list3.append(net_proportion3)
    # medium order net inflow: net amount
    net_amount4 = all_list[i].split(',')[3]
    net_amount_list4.append(net_amount4)
    # medium order net inflow: net share
    net_proportion4 = all_list[i].split(',')[8]
    net_proportion_list4.append(net_proportion4)
    # small order net inflow: net amount
    net_amount5 = all_list[i].split(',')[2]
    net_amount_list5.append(net_amount5)
    # small order net inflow: net share
    net_proportion5 = all_list[i].split(',')[7]
    net_proportion_list5.append(net_proportion5)

import pandas as pd

df = pd.DataFrame()
df['date'] = data_list
df['closing price'] = latest_price_list
df['price change (%)'] = price_limit_list
df['main force net inflow - net amount'] = net_amount_list1
df['main force net inflow - net share (%)'] = net_proportion_list1
df['extra-large order net inflow - net amount'] = net_amount_list2
df['extra-large order net inflow - net share (%)'] = net_proportion_list2
df['large order net inflow - net amount'] = net_amount_list3
df['large order net inflow - net share (%)'] = net_proportion_list3
df['medium order net inflow - net amount'] = net_amount_list4
df['medium order net inflow - net share (%)'] = net_proportion_list4
df['small order net inflow - net amount'] = net_amount_list5
df['small order net inflow - net share (%)'] = net_proportion_list5
# write to Excel
df.to_excel('Dongxu Lantian capital flow.xlsx')
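As a design note, the same table can be built more compactly by splitting each record once and letting pandas select the columns by position (a sketch that mirrors the indices used in the loop above):

import pandas as pd

# split every kline string once, then pick and rename columns by position
rows = [k.split(',') for k in result['data']['klines']]
df = pd.DataFrame(rows).iloc[:, [0, 11, 12, 1, 6, 5, 10, 4, 9, 3, 8, 2, 7]]
df.columns = ['date', 'closing price', 'price change (%)',
              'main force net inflow - net amount',
              'main force net inflow - net share (%)',
              'extra-large order net inflow - net amount',
              'extra-large order net inflow - net share (%)',
              'large order net inflow - net amount',
              'large order net inflow - net share (%)',
              'medium order net inflow - net amount',
              'medium order net inflow - net share (%)',
              'small order net inflow - net amount',
              'small order net inflow - net share (%)']
df.to_excel('Dongxu Lantian capital flow.xlsx', index=False)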
The capital flow data of Dongxu Lantian is now saved in the Excel sheet.
At this point, the study of "how to understand the process of obtaining web page data by Python" is over. I hope it has resolved your doubts; combining theory with practice is the best way to learn, so go and try it! If you want to keep learning more related knowledge, please continue to follow this site, where the editor will keep bringing you more practical articles!