In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-01 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >
Share
Shulou(Shulou.com)06/01 Report--
This article is to share with you about the use of Cookie in Pytho, the editor thinks it is very practical, so I share it with you to learn. I hope you can get something after reading this article.
Next let's take a look at the use of Cookie.
Why use Cookie?
Cookie refers to the data (usually encrypted) stored on the user's local terminal by some websites in order to identify users and perform session tracking.
For example, some websites need to log in before you can access a certain page, and it is not allowed for you to crawl the content of a certain page before logging in. Then we can use the Urllib2 library to save the Cookie we logged in, and then crawl other pages to achieve our goal. (Xiamen electric forklift python support)
Before that, we must first introduce a concept of opener.
1.Opener
When you get a URL, you use an opener (an instance of urllib2.OpenerDirector). Previously, we all used the default opener, which is urlopen. It is a special opener, which can be understood as a special instance of opener, and the parameter passed in is just url,timeout.
If we need to use Cookie, we can't do it with this opener alone, so we need to create a more general opener to set up Cookie.
2.Cookielib
The main function of the cookielib module is to provide objects that can store cookie so that it can be used with the urllib2 module to access Internet resources. The Cookielib module is so powerful that we can use the objects of the CookieJar class of this module to capture the cookie and resend it on subsequent connection requests, such as the ability to simulate login. The main objects of this module are CookieJar, FileCookieJar, MozillaCookieJar and LWPCookieJar.
Their relationship: CookieJar-- derivative-- > FileCookieJar-- derivative-- > MozillaCookieJar and LWPCookieJar
1) get the Cookie saved to the variable
First of all, let's use the CookieJar object to achieve the function of getting cookie. In the variable, let's first feel it.
Python
Import urllib2
Import cookielib
# declare an instance of a CookieJar object to hold cookie
Cookie = cookielib.CookieJar ()
# create a cookie processor using the HTTPCookieProcessor object of the urllib2 library
Handler=urllib2.HTTPCookieProcessor (cookie)
# build opener through handler
Opener = urllib2.build_opener (handler)
# the open method here is the same as the urlopen method of urllib2, or you can pass in request
Response = opener.open ('http://www.baidu.com')
For item in cookie:
Print 'Name =' + item.name
Print 'Value =' + item.value
We use the above method to save cookie to a variable, and then print out the value in cookie. The result is as follows
Python
Name = BAIDUID
Value = B07B663B645729F11F659C02AAE65B4C:FG=1
Name = BAIDUPSID
Value = B07B663B645729F11F659C02AAE65B4C
Name = H_PS_PSSID
Value = 1252711076143810633
Name = BDSVRTM
Value = 0
Name = BD_HOME
Value = 0
2) Save Cookie to file
In the above method, we saved cookie to the variable cookie. What if we want to save cookie to a file? At this point, we're going to need it.
FileCookieJar this object, here we use its subclass MozillaCookieJar to achieve the preservation of Cookie
Python
Import cookielib
Import urllib2
# set the file to save cookie and cookie.txt in the same directory
Filename = 'cookie.txt'
# declare a MozillaCookieJar object instance to save the cookie, and then write to the file
Cookie = cookielib.MozillaCookieJar (filename)
# create a cookie processor using the HTTPCookieProcessor object of the urllib2 library
Handler = urllib2.HTTPCookieProcessor (cookie)
# build opener through handler
Opener = urllib2.build_opener (handler)
# create a request based on the same principle as urllib2's urlopen
Response = opener.open ("http://www.baidu.com")
# Save cookie to file
Cookie.save (ignore_discard=True, ignore_expires=True)
The two parameters of the final save method are explained here:
The official explanation is as follows:
Ignore_discard: save even cookies set to be discarded.
Ignore_expires: save even cookies that have expiredThe file is overwritten if it already exists
Thus, ignore_discard means to save the cookies even if it will be discarded, and ignore_expires means that if the cookies already exists in the file, the original file is overwritten, and here, we set both to True. After running, the cookies will be saved to the cookie.txt file. Let's take a look at the contents, as shown below.
3) get the Cookie from the file and visit
So we have saved the Cookie to a file. If you want to use it later, you can use the following method to read the cookie and visit the website to feel it.
Python
Import cookielib
Import urllib2
# create MozillaCookieJar instance object
Cookie = cookielib.MozillaCookieJar ()
# read cookie contents to variables from a file
Cookie.load ('cookie.txt', ignore_discard=True, ignore_expires=True)
# create the requested request
Req = urllib2.Request ("http://www.baidu.com")
# create an opener using the build_opener method of urllib2
Opener = urllib2.build_opener (urllib2.HTTPCookieProcessor (cookie))
Response = opener.open (req)
Print response.read ()
Imagine that if the cookie of a person logging in to Baidu is saved in our cookie.txt file, then we can extract the contents of this cookie file and use the above method to simulate that person's account to log in to Baidu.
4) use cookie to simulate website login
Let's take our school as an example, use cookie to achieve simulated login, and save the cookie information to a text file, to feel the cookie Dafa!
Note: I have changed the password, do not secretly log on to my course selection system o (╯□╰) o
Python
Import urllib
Import urllib2
Import cookielib
Filename = 'cookie.txt'
# declare a MozillaCookieJar object instance to save the cookie, and then write to the file
Cookie = cookielib.MozillaCookieJar (filename)
Opener = urllib2.build_opener (urllib2.HTTPCookieProcessor (cookie))
Postdata = urllib.urlencode ({
'stuid':'201200131012'
'pwd':'23342321'
})
# Log in to the URL of educational administration system
LoginUrl = 'http://jwxt.sdu.edu.cn:7890/pls/wwwbks/bks_login2.login'
# simulate login and save cookie to variable
Result = opener.open (loginUrl,postdata)
# Save cookie to cookie.txt
Cookie.save (ignore_discard=True, ignore_expires=True)
# use cookie to request access to another web site, which is the results query URL
GradeUrl = 'http://jwxt.sdu.edu.cn:7890/pls/wwwbks/bkscjcx.curscopre'
# request to visit the results query website
Result = opener.open (gradeUrl)
Print result.read ()
The principle of the above program is as follows
Create an opener with cookie. When you access the logged-in URL, save the logged-in cookie, and then use this cookie to access other URLs.
Such as the query that can only be checked after logging in, the class schedule for this semester, and so on, the simulated login is realized in this way, isn't it cool?
All right, guys, come on! We can get the website information smoothly now, and the next step is to extract the valid content from the website.
The above is the use of Cookie in Pytho. The editor believes that there are some knowledge points that we may see or use in our daily work. I hope you can learn more from this article. For more details, please follow the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.