This article introduces how to use Request objects and Response objects in the Python Scrapy framework. These two objects come up in almost every real crawling project, so the following walkthrough explains what they do and how to handle the situations where you need them. I hope you read it carefully and come away with something useful!
1. Request object
The Request object is used to request data: Scrapy creates one whenever a page is crawled or a request needs to be re-sent. The class is defined in scrapy.http.Request. The source code of its constructor is reproduced below; it takes quite a few parameters:
class Request(object_ref):

    def __init__(self, url, callback=None, method='GET', headers=None, body=None,
                 cookies=None, meta=None, encoding='utf-8', priority=0,
                 dont_filter=False, errback=None, flags=None, cb_kwargs=None):
        self._encoding = encoding  # this one has to be set first
        self.method = str(method).upper()
        self._set_url(url)
        self._set_body(body)
        if not isinstance(priority, int):
            raise TypeError(f"Request priority not an integer: {priority!r}")
        self.priority = priority

        if callback is not None and not callable(callback):
            raise TypeError(f'callback must be a callable, got {type(callback).__name__}')
        if errback is not None and not callable(errback):
            raise TypeError(f'errback must be a callable, got {type(errback).__name__}')
        self.callback = callback
        self.errback = errback

        self.cookies = cookies or {}
        self.headers = Headers(headers or {}, encoding=encoding)
        self.dont_filter = dont_filter

        self._meta = dict(meta) if meta else None
        self._cb_kwargs = dict(cb_kwargs) if cb_kwargs else None
        self.flags = [] if flags is None else list(flags)
Here is a brief explanation of each parameter (a short usage sketch follows the list):
url: the URL this request is sent to.
callback: the callback function executed after the downloader has downloaded the corresponding response.
method: the HTTP method, GET by default; it can be set to other methods.
headers: the request headers. Headers that stay the same across requests can be set once in settings.py; ones that vary can be passed when the request is created.
body: the request body, used to pass request parameters.
meta: very commonly used; carries data from one request to the next.
encoding: the encoding, utf-8 by default; the default is usually fine.
dont_filter: tells the scheduler not to de-duplicate this request; useful when the same request must be sent repeatedly.
errback: the function executed when an error occurs.
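To make the parameters concrete, here is a minimal sketch of how a Request is typically built inside a spider callback. The spider name, URLs, header value and meta key are placeholders chosen for illustration, not anything prescribed by Scrapy.

# Minimal sketch: building a Request with callback, headers, meta and dont_filter.
import scrapy


class RequestDemoSpider(scrapy.Spider):
    name = "request_demo"
    start_urls = ["http://quotes.toscrape.com/"]

    def parse(self, response):
        # Follow another page, passing data to the next callback via meta
        # and overriding one header just for this request.
        yield scrapy.Request(
            url="http://quotes.toscrape.com/page/2/",
            callback=self.parse_detail,
            headers={"Referer": response.url},
            meta={"page_no": 2},
            dont_filter=False,  # default; set True to bypass the duplicate filter
        )

    def parse_detail(self, response):
        # meta set on the Request is available again on the Response
        self.logger.info("Got page %s", response.meta["page_no"])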
2. Send POST request
Sometimes we need to send a POST request to fetch data. For that, use FormRequest, a subclass of Request. If the spider should send a POST request right at start-up, override the spider's start_requests(self) method instead of relying on the URLs in start_urls.
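A minimal sketch of how this usually looks, assuming a hypothetical login endpoint and form fields; FormRequest URL-encodes the formdata dict and sends it as the POST body.

# Minimal sketch: sending a POST request at spider start-up via FormRequest.
import scrapy


class PostDemoSpider(scrapy.Spider):
    name = "post_demo"

    def start_requests(self):
        # Overriding start_requests() means start_urls is not used at all.
        yield scrapy.FormRequest(
            url="http://quotes.toscrape.com/login",
            formdata={"username": "user", "password": "secret"},
            callback=self.after_login,
        )

    def after_login(self, response):
        self.logger.info("Login response status: %s", response.status)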
3. Response object
Response objects are normally constructed for you by Scrapy, so developers rarely need to create one themselves; the question is how to use them. A Response exposes a number of attributes that are useful for extracting data.
The main attributes are as follows (a usage sketch follows the list):
meta: the meta dict carried over from the originating request; used to keep data flowing across multiple requests.
encoding: the encoding used to decode and encode the response.
text: the response body as a unicode string.
body: the response body as bytes.
xpath: the XPath selector, for extracting data with XPath expressions.
css: the CSS selector, for extracting data with CSS expressions.
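Below is a minimal sketch showing these attributes inside a parse callback; the XPath and CSS expressions target the public quotes.toscrape.com demo site and are only illustrative.

# Minimal sketch: using Response attributes and selectors in a callback.
import scrapy


class ResponseDemoSpider(scrapy.Spider):
    name = "response_demo"
    start_urls = ["http://quotes.toscrape.com/"]

    def parse(self, response):
        self.logger.info("encoding: %s", response.encoding)      # e.g. 'utf-8'
        self.logger.info("body bytes: %d", len(response.body))   # raw bytes
        first_quote = response.xpath('//span[@class="text"]/text()').get()
        first_author = response.css("small.author::text").get()
        yield {
            "quote": first_quote,
            "author": first_author,
            "page_text_length": len(response.text),  # decoded unicode string
        }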
This concludes "how to use Request objects and Response objects in the Python Scrapy framework". Thank you for reading; if you want to learn more, you can follow the site for more practical articles.