
How to Find the Keyword Page with a Python Requests Crawler

2025-02-28 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

In this article, the editor shares how to find a keyword page with a Python Requests crawler. Many readers may not be familiar with the topic, so this article is shared for your reference; I hope you learn a lot from reading it. Let's get into it!

Requirements: crawl the page data of Sogou's home page.

import requests

if __name__ == '__main__':
    # Step 1: specify the URL
    url = 'https://123.sogou.com/'
    # Step 2: initiate the request; the get method returns a response object
    response = requests.get(url=url)
    # Step 3: get the response data; .text returns the response body as a string
    page_text = response.text
    print(page_text)
    # Step 4: persistent storage
    with open('./sogou.html', 'w', encoding='utf-8') as fp:
        fp.write(page_text)
    print("crawl data ends")
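The basic example above works for a simple page, but in practice a request can time out or return an error status. Below is a minimal hardened sketch of my own, not part of the original tutorial: the timeout value, the raise_for_status() check, and the apparent_encoding assignment are added for robustness.

import requests

if __name__ == '__main__':
    url = 'https://123.sogou.com/'
    try:
        # a timeout keeps the crawler from hanging forever on a slow server
        response = requests.get(url=url, timeout=10)
        # raise an exception on 4xx/5xx status codes instead of saving an error page
        response.raise_for_status()
        # let requests guess the encoding from the page content before reading .text
        response.encoding = response.apparent_encoding
        page_text = response.text
        with open('./sogou.html', 'w', encoding='utf-8') as fp:
            fp.write(page_text)
        print("crawl data ends")
    except requests.RequestException as err:
        print("request failed:", err)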

Use UA camouflage to get the keyword page.

import requests

if __name__ == '__main__':
    # UA camouflage: encapsulate the corresponding User-Agent in a dictionary
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/98.0.4758.9 Safari/537.36'
    }
    url = 'https://www.sogou.com/sie?'
    # Handle the parameters carried by the URL: encapsulate them in a dictionary
    kw = input('enter a word:')
    param = {'query': kw}
    # Request the specified URL; params carries the query parameters and headers carries the UA disguise
    response = requests.get(url=url, params=param, headers=headers)
    page_text = response.text  # the response data as text
    fileName = kw + '.html'    # store it as a web page
    with open(fileName, 'w+', encoding='utf-8') as fp:
        fp.write(page_text)    # write the page to fp
    print(fileName, "saved successfully!")
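Since the goal is to find the keyword's page, it can help to confirm that the fetched HTML actually relates to the search word before saving it. The check below is a minimal sketch of my own and is not part of the original code; it simply tests whether the keyword appears in the returned text.

import requests

if __name__ == '__main__':
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/98.0.4758.9 Safari/537.36'
    }
    url = 'https://www.sogou.com/sie?'
    kw = input('enter a word:')
    param = {'query': kw}
    response = requests.get(url=url, params=param, headers=headers, timeout=10)
    page_text = response.text
    # simple sanity check: does the keyword appear anywhere in the returned HTML?
    if kw in page_text:
        print("keyword found in the page, saving...")
        with open(kw + '.html', 'w', encoding='utf-8') as fp:
            fp.write(page_text)
    else:
        print("keyword not found; the request may have been blocked or redirected")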

These are all the contents of the article "How to Find the Keyword Page with a Python Requests Crawler". Thank you for reading! I hope the content shared here has been helpful; if you want to learn more, welcome to follow the industry information channel!

