This article explains how to make network requests in Python with the built-in urllib package. The content is simple and clearly organized, so it is easy to learn and understand; follow along section by section.
1. Brief introduction
urllib is a Python standard-library package for making network requests. It is built in, so no additional installation is needed, and it consists of four modules:

urllib.request: opens and reads URLs; used to send requests and fetch web page responses.
urllib.error: handles exceptions raised by urllib.request so the program can keep running normally.
urllib.parse: parses URLs; can split, join, and encode them.
urllib.robotparser: parses robots.txt files to determine whether a website may be crawled.
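The first three modules are demonstrated throughout the rest of this article, but urllib.robotparser is not, so here is a minimal sketch of it; the robots.txt URL is only an example:

from urllib import robotparser

# minimal sketch (not from the article): ask robots.txt whether a page
# may be crawled; the URL is just an example
rp = robotparser.RobotFileParser()
rp.set_url('http://www.baidu.com/robots.txt')
rp.read()
print(rp.can_fetch('*', 'http://www.baidu.com/index.html'))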
2. Sending a basic request

import urllib.request

# method 1: call urlopen directly
resp = urllib.request.urlopen('http://www.baidu.com', timeout=1)
print(resp.read().decode('utf-8'))

# method 2: build a Request object first, then open it
request = urllib.request.Request('http://www.baidu.com')
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))
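One detail worth adding (not in the original article): the object urlopen returns can be used as a context manager, which closes the connection automatically even if reading fails. A small variant of method 1:

import urllib.request

# same request as above, but the with-block closes the response for us
with urllib.request.urlopen('http://www.baidu.com', timeout=1) as resp:
    print(resp.status)
    print(resp.read().decode('utf-8'))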
3. Request with parameters

Some pages require data to be sent along with the request; pass it through the data argument.
import urllib.parse
import urllib.request

params = {'name': 'autofelix', 'age': '25'}
data = bytes(urllib.parse.urlencode(params), encoding='utf8')
response = urllib.request.urlopen("http://www.baidu.com/", data=data)
print(response.read().decode('utf-8'))

4. Getting response data

import urllib.request

resp = urllib.request.urlopen('http://www.baidu.com')
print(type(resp))
print(resp.status)
print(resp.geturl())
print(resp.getcode())
print(resp.info())
print(resp.getheaders())
print(resp.getheader('Server'))

5. Setting headers

import urllib.request

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'}
request = urllib.request.Request(url="http://tieba.baidu.com/", headers=headers)
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))

6. Using a proxy

import urllib.request

proxys = urllib.request.ProxyHandler({
    'http': 'proxy.cn:8080',
    'https': 'proxy.cn:8080'
})
opener = urllib.request.build_opener(proxys)
urllib.request.install_opener(opener)
request = urllib.request.Request(url="http://www.baidu.com/")
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))
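Note that install_opener changes the default opener for the whole process. If that is not wanted, the opener can be used directly instead; a sketch, reusing the article's placeholder proxy address:

import urllib.request

# the proxy applies only to requests made through this opener object;
# proxy.cn:8080 is the article's placeholder address, not a real proxy
proxys = urllib.request.ProxyHandler({'http': 'proxy.cn:8080',
                                      'https': 'proxy.cn:8080'})
opener = urllib.request.build_opener(proxys)
response = opener.open('http://www.baidu.com/')
print(response.read().decode('utf-8'))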
7. Authentication login

Some websites require logging in with a username and password before you can continue browsing.
import urllib.request

url = "http://www.baidu.com/"
user = 'autofelix'
password = '123456'
pwdmgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
pwdmgr.add_password(None, url, user, password)
auth_handler = urllib.request.HTTPBasicAuthHandler(pwdmgr)
opener = urllib.request.build_opener(auth_handler)
response = opener.open(url)
print(response.read().decode('utf-8'))
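The same pattern extends to proxies that demand credentials, via ProxyBasicAuthHandler. A hedged sketch; the proxy address and credentials here are made-up examples:

import urllib.request

# the password manager works as above, but for a proxy that requires login;
# proxy address, user, and password are illustrative only
pwdmgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
pwdmgr.add_password(None, 'http://proxy.cn:8080', 'autofelix', '123456')
proxy_auth = urllib.request.ProxyBasicAuthHandler(pwdmgr)
opener = urllib.request.build_opener(proxy_auth)
response = opener.open('http://www.baidu.com/')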
8. Setting cookies

If a page requires authentication on every visit, we can use cookies to log in automatically and avoid repeating the verification.
import http.cookiejar
import urllib.request

cookie = http.cookiejar.CookieJar()
handler = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handler)
response = opener.open("http://www.baidu.com/")
f = open('cookie.txt', 'a')
for item in cookie:
    f.write(item.name + " = " + item.value + '\n')
f.close()

9. Exception handling

from urllib import error, request

try:
    resp = request.urlopen('http://www.baidu.com')
except error.URLError as e:
    print(e.reason)

10. HTTP exceptions

Catch the more specific HTTPError before the general URLError.

from urllib import error, request

try:
    resp = request.urlopen('http://www.baidu.com')
except error.HTTPError as e:
    print(e.reason, e.code, e.headers, sep='\n')
except error.URLError as e:
    print(e.reason)
else:
    print('request successful')

11. Timeout exceptions

import socket
import urllib.request
import urllib.error

try:
    resp = urllib.request.urlopen('http://www.baidu.com', timeout=0.01)
except urllib.error.URLError as e:
    print(type(e.reason))
    if isinstance(e.reason, socket.timeout):
        print('timeout')

12. Encoding and decoding

from urllib import parse

name = parse.quote('flying rabbit')  # percent-encode the string
parse.unquote(name)                  # decode it back to the original
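A brief addition (not in the original) on the difference between quote and quote_plus, which is easy to trip over:

from urllib import parse

# quote encodes a space as %20 and leaves '/' alone; quote_plus encodes
# spaces as '+', the convention used by HTML form data
print(parse.quote('flying rabbit'))       # flying%20rabbit
print(parse.quote_plus('flying rabbit'))  # flying+rabbit
print(parse.unquote('flying%20rabbit'))   # flying rabbit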
13. Parameter splicing

When accessing a URL we often need to pass many query parameters, and building the query string by hand through string concatenation is tedious and error-prone. urlencode builds it from a dictionary.
from urllib import parse

params = {'name': 'flying rabbit', 'age': '27', 'height': '178'}
parse.urlencode(params)

14. Parsing a request link

from urllib.parse import urlparse

result = urlparse('http://www.baidu.com/index.html?user=autofelix')
print(type(result))
print(result)
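Since the printed ParseResult can look opaque, here is a short sketch (not in the original) of reading its fields by name and reassembling the URL:

from urllib.parse import urlparse, urlunparse

# ParseResult is a named tuple, so its parts are readable by name,
# and urlunparse rebuilds the original URL from them
result = urlparse('http://www.baidu.com/index.html?user=autofelix')
print(result.scheme)       # http
print(result.netloc)       # www.baidu.com
print(result.path)         # /index.html
print(result.query)        # user=autofelix
print(urlunparse(result))  # the original URL, reassembled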
15. Splicing links

urljoin covers two cases: if two complete links are joined, the second link takes precedence and is returned; if a base link is joined with a relative path or parameters, the combined URL is returned. A sketch of the first case is shown next; the article's own example after it shows the second.
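from urllib.parse import urljoin

# first case: the second argument is itself a complete link, so it replaces
# the base entirely (taobao.com is only an illustrative URL)
print(urljoin('http://www.baidu.com', 'https://www.taobao.com/index.html'))
# -> https://www.taobao.com/index.html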
from urllib.parse import urljoin

print(urljoin('http://www.baidu.com', 'index.html'))
# -> http://www.baidu.com/index.html

16. Converting a dictionary to URL parameters

from urllib.parse import urlencode

params = {'name': 'autofelix', 'age': 27}
baseUrl = 'http://www.baidu.com?'
print(baseUrl + urlencode(params))

Thank you for reading. That covers how to make network requests in Python with the built-in urllib package. After studying this article you should have a deeper understanding of the topic, though the specifics still need to be verified in practice.