

How to make network requests with the urllib package in Python

2025-02-26 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article explains how to make network requests with the urllib package in Python. The content is simple, clear, and easy to learn and understand; please follow along with the editor to study it.

1. Brief introduction

urllib is a Python built-in standard library for network requests, so no additional installation is needed. It contains four modules:

urllib.request: opens and reads URLs; used to send requests and get web page response content.

urllib.error: handles exceptions raised by urllib.request to keep the program running normally.

urllib.parse: parses URLs; supports splitting, joining, and so on.

urllib.robotparser: parses robots.txt files to determine whether a website may be crawled.
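The robots.txt handling mentioned above can be tried without any network access by feeding urllib.robotparser a list of rule lines directly; the rules and URLs below are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, supplied inline instead of fetched from a site
rules = [
    'User-agent: *',
    'Disallow: /private/',
]

rp = RobotFileParser()
rp.parse(rules)

# Paths under /private/ are disallowed; everything else is allowed
print(rp.can_fetch('*', 'http://example.com/private/data.html'))  # False
print(rp.can_fetch('*', 'http://example.com/index.html'))         # True
```

In real use you would call `rp.set_url('http://.../robots.txt')` followed by `rp.read()` instead of `parse()`.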

2. Sending a request

```python
import urllib.request

# Method 1: call urlopen directly
resp = urllib.request.urlopen('http://www.baidu.com', timeout=1)
print(resp.read().decode('utf-8'))

# Method 2: build a Request object first, then open it
request = urllib.request.Request('http://www.baidu.com')
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))
```

3. Request with parameters

Some web pages require you to carry data along with the request.

```python
import urllib.parse
import urllib.request

params = {'name': 'autofelix', 'age': '25'}
data = bytes(urllib.parse.urlencode(params), encoding='utf8')
response = urllib.request.urlopen('http://www.baidu.com/', data=data)
print(response.read().decode('utf-8'))
```

4. Getting response data

```python
import urllib.request

resp = urllib.request.urlopen('http://www.baidu.com')
print(type(resp))
print(resp.status)
print(resp.geturl())
print(resp.getcode())
print(resp.info())
print(resp.getheaders())
print(resp.getheader('Server'))
```

5. Setting headers

```python
import urllib.request

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'
}
request = urllib.request.Request(url='http://tieba.baidu.com/', headers=headers)
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))
```

6. Using a proxy

```python
import urllib.request

proxys = urllib.request.ProxyHandler({
    'http': 'proxy.cn:8080',
    'https': 'proxy.cn:8080'
})
opener = urllib.request.build_opener(proxys)
urllib.request.install_opener(opener)
request = urllib.request.Request(url='http://www.baidu.com/')
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))
```

7. Authenticated login

Some websites require logging in with a username and password before you can continue browsing.

```python
import urllib.request

url = 'http://www.baidu.com/'
user = 'autofelix'
password = '123456'

pwdmgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
pwdmgr.add_password(None, url, user, password)

auth_handler = urllib.request.HTTPBasicAuthHandler(pwdmgr)
opener = urllib.request.build_opener(auth_handler)
response = opener.open(url)
print(response.read().decode('utf-8'))
```

8. Setting cookies

If a page requires authentication on every request, we can use cookies to log in automatically and avoid repeated login verification.
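If you want a cookie file that urllib can load back on the next run (rather than a plain name=value dump), http.cookiejar.MozillaCookieJar is one option. A minimal offline sketch, where the cookie is built by hand to stand in for one a server would set; the name, value, and domain are made up for illustration:

```python
import http.cookiejar

# A hand-built cookie standing in for one received from a server
cookie = http.cookiejar.Cookie(
    version=0, name='session', value='abc123',
    port=None, port_specified=False,
    domain='example.com', domain_specified=True, domain_initial_dot=False,
    path='/', path_specified=True,
    secure=False, expires=2000000000, discard=False,
    comment=None, comment_url=None, rest={}, rfc2109=False)

jar = http.cookiejar.MozillaCookieJar('cookie_jar.txt')
jar.set_cookie(cookie)
jar.save()  # writes a Netscape-format file that load() can read back

# A fresh jar in a later run can restore the saved cookies
jar2 = http.cookiejar.MozillaCookieJar()
jar2.load('cookie_jar.txt')
print([c.name for c in jar2])  # ['session']
```

Passing `jar2` to `urllib.request.HTTPCookieProcessor` then attaches the restored cookies to subsequent requests.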

```python
import http.cookiejar
import urllib.request

cookie = http.cookiejar.CookieJar()
handler = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handler)
response = opener.open('http://www.baidu.com/')

f = open('cookie.txt', 'a')
for item in cookie:
    f.write(item.name + '=' + item.value + '\n')
f.close()
```

9. Exception handling

```python
from urllib import error, request

try:
    resp = request.urlopen('http://www.baidu.com')
except error.URLError as e:
    print(e.reason)
```

10. HTTP exceptions

```python
from urllib import error, request

try:
    resp = request.urlopen('http://www.baidu.com')
except error.HTTPError as e:
    print(e.reason, e.code, e.headers, sep='\n')
except error.URLError as e:
    print(e.reason)
else:
    print('request successful')
```

11. Timeout exceptions

```python
import socket
import urllib.request
import urllib.error

try:
    resp = urllib.request.urlopen('http://www.baidu.com', timeout=0.01)
except urllib.error.URLError as e:
    print(type(e.reason))
    if isinstance(e.reason, socket.timeout):
        print('timeout')
```

12. Encoding and decoding URLs

```python
from urllib import parse

name = parse.quote('flying rabbit')  # percent-encode the string
parse.unquote(name)                  # convert it back
```

13. Parameter concatenation

When accessing a URL, we often need to pass many query parameters.

Building the query string by hand with string concatenation is tedious and error-prone.

```python
from urllib import parse

params = {'name': 'Flying Rabbit', 'age': '27', 'height': '178'}
parse.urlencode(params)
```

14. Parsing a request link

```python
from urllib.parse import urlparse

result = urlparse('http://www.baidu.com/index.html?user=autofelix')
print(type(result))
print(result)
```

15. Joining links

If two absolute links are joined, urljoin returns the latter link.

If a link is joined with a relative path or query parameters, the combined result is returned.
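The two behaviours described above can be checked directly; example.com here is just an illustrative second link:

```python
from urllib.parse import urljoin

# Case 1: joining two absolute links returns the latter link
print(urljoin('http://www.baidu.com/index.html', 'http://example.com/faq.html'))
# http://example.com/faq.html

# Case 2: joining a link with query parameters returns the combined result
print(urljoin('http://www.baidu.com/index.html', '?user=autofelix'))
# http://www.baidu.com/index.html?user=autofelix
```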

```python
from urllib.parse import urljoin

print(urljoin('http://www.baidu.com', 'index.html'))
```

16. Converting a dictionary into URL parameters

```python
from urllib.parse import urlencode

params = {'name': 'autofelix', 'age': 27}
baseUrl = 'http://www.baidu.com?'
print(baseUrl + urlencode(params))
```

Thank you for reading. This is the content of "how to make network requests with the urllib package in Python". After studying this article, I believe you have a deeper understanding of the topic; the specific usage still needs to be verified in practice. The editor will push more related articles for you. Welcome to follow!
