This article explains how to use a Python crawler to log in to Renren and fetch information from it. The content is straightforward and easy to follow; read on to learn how to crawl information from Renren with a Python crawler.
The requests library provides a Session class to persist session state between the client and the server.
Usage
1. Instantiate a session object
2. Use the session object to send GET or POST requests
session = requests.session()
session.get(url, headers=headers)
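To see what "session persistence" means in practice, here is a minimal sketch. It uses the public httpbin.org test service purely as an illustration (it is not part of the original example): a cookie set by one response is automatically sent back on later requests made through the same session.

# coding=utf-8
import requests

session = requests.session()

# The server sets a cookie on this response; the session stores it automatically
session.get("https://httpbin.org/cookies/set/sessionid/abc123")

# The stored cookie is sent back automatically on the next request
# made through the same session
r = session.get("https://httpbin.org/cookies")
print(r.json())  # -> {'cookies': {'sessionid': 'abc123'}}

A plain requests.get() call, by contrast, starts with no stored cookies each time, which is why the login example below goes through a session.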
Next, let's put this into practice with Renren.
# coding=utf-8
import requests

session = requests.session()

# Login form URL
post_url = "http://www.renren.com/PLogin.do"
post_data = {"email": "your_email", "password": "your_password"}
headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"
}

# Send the POST request through the session; the login cookie is saved in the session
session.post(post_url, data=post_data, headers=headers)

# Use the session to request an address that can only be accessed after login
# (this is the personal home page URL)
r = session.get("http://www.renren.com/327550088/profile", headers=headers)

# Save the page locally
with open("renren1.html", "w", encoding="utf-8") as f:
    f.write(r.content.decode("utf-8"))
It's as simple as that: we simulate logging in to Renren, fetch the personal home page, and save it locally.
In fact, a website records your login state through the information carried in the cookie. So if we send a request that carries a login cookie, can we access pages that normally require logging in? Of course we can.
Please look at the code
# coding=utf-8
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36",
    "Cookie": "your login cookie"
}

r = requests.get("http://www.renren.com/327550088/profile", headers=headers)

# Save the page
with open("renren2.html", "w", encoding="utf-8") as f:
    f.write(r.content.decode())
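Where does that login cookie come from? You can copy it from your browser, or, if you have already logged in through a session as in the first example, read the cookies the session stored as a plain dict. Below is a minimal sketch under that assumption; get_dict() is the helper the requests cookie jar provides for this.

# coding=utf-8
import requests

session = requests.session()

# Log in first, exactly as in the session example above
session.post("http://www.renren.com/PLogin.do",
             data={"email": "your_email", "password": "your_password"})

# Read the cookies the server set during login as a plain dict
print(session.cookies.get_dict())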
As you can see, the cookie can be placed directly in headers. In addition, requests has a dedicated parameter for passing cookies: the cookies parameter.
Please look at the code
# coding=utf-8
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"
}

# The raw cookie string copied from the browser
cookies = "your login cookie"

# Use a dict comprehension to turn the "key=value; key=value" string into a dict
cookies = {i.split("=")[0]: i.split("=")[1] for i in cookies.split("; ")}
print(cookies)

r = requests.get("http://www.renren.com/327550088/profile", headers=headers, cookies=cookies)

Thank you for reading. That is all for "how to use a Python crawler to crawl information from Renren". After working through this article you should have a deeper understanding of the topic, though the details still need to be verified in practice. The editor will keep publishing more articles on related topics; you are welcome to follow!