How to Use Request and Response Objects in the Python Scrapy Framework

2025-01-31 Update From: SLTechnology News&Howtos

This article explains how to use Request objects and Response objects in the Python Scrapy framework. These two classes come up in almost every real crawling task, so it is worth understanding what each parameter and attribute does.

1. Request object

A Request object represents an HTTP request. You create one whenever the spider needs to crawl a page or send a follow-up request. The class is defined in Scrapy's scrapy.http package and is also importable as scrapy.Request.

Its constructor, reproduced here from the Scrapy source, takes many parameters:

class Request(object_ref):

    def __init__(self, url, callback=None, method='GET', headers=None, body=None,
                 cookies=None, meta=None, encoding='utf-8', priority=0,
                 dont_filter=False, errback=None, flags=None, cb_kwargs=None):
        self._encoding = encoding  # this one has to be set first
        self.method = str(method).upper()
        self._set_url(url)
        self._set_body(body)
        if not isinstance(priority, int):
            raise TypeError(f"Request priority not an integer: {priority!r}")
        self.priority = priority

        if callback is not None and not callable(callback):
            raise TypeError(f'callback must be a callable, got {type(callback).__name__}')
        if errback is not None and not callable(errback):
            raise TypeError(f'errback must be a callable, got {type(errback).__name__}')
        self.callback = callback
        self.errback = errback

        self.cookies = cookies or {}
        self.headers = Headers(headers or {}, encoding=encoding)
        self.dont_filter = dont_filter

        self._meta = dict(meta) if meta else None
        self._cb_kwargs = dict(cb_kwargs) if cb_kwargs else None
        self.flags = [] if flags is None else list(flags)

Here is a brief explanation of the most commonly used parameters:

url: the URL this request is sent to.

callback: the function that is called with the response once the downloader has fetched the data.

method: the HTTP method. It defaults to GET and can be set to other methods; the constructor upper-cases whatever value is passed.

headers: the request headers. Headers that stay the same for every request can be set in settings.py; headers that vary can be passed when the request is created.

body: the request body, that is, the payload sent along with the request.

meta: commonly used. A dict for passing data between requests; it is available as response.meta in the callback.

encoding: the encoding of the request. The default is utf-8, which is usually fine.

dont_filter: when True, the request is not filtered out by the scheduler's duplicate filter. Often used when the same request has to be sent more than once.

errback: the function that is called when an error occurs while processing the request.

2. Send POST request

Sometimes the data has to be requested with POST. In that case, use FormRequest, a subclass of Request. If the very first request of the spider must be a POST, override the start_requests(self) method in the spider; the URLs in start_urls are then no longer requested automatically.

3. Response object

Response objects are usually built for you by Scrapy, so you rarely need to create one yourself. What matters is how to use them. A Response object has many attributes and methods that can be used to extract data.

The main attributes and methods are as follows:

meta: the meta dict of the request that produced this response. It can be used to carry data across multiple requests.

encoding: the encoding used to decode the response body into a string.

text: the response data as a unicode string (str).

body: the response data as bytes.

xpath: runs an XPath selector against the response.

css: runs a CSS selector against the response.

