How to send http requests with socket

2025-04-06 Update From: SLTechnology News&Howtos

This article explains how to send an HTTP request with a raw socket. It walks through the problem and its solution in detail, in the hope of giving readers who want to solve it a simpler way in.

We usually browse the web with a browser, and when writing crawlers we use ready-made libraries such as requests, or a crawler framework. As the saying goes, to do a good job you must first sharpen your tools; these high-level wrappers exist so that we can work conveniently and save development time. Powerful as the various HTTP libraries are, learning the underlying technology still has practical value. Only by understanding the lower layer can you really understand how the high-level wrappers are designed, and when you run into a difficult problem you will have ideas for solving it.

1. Sending an HTTP request with a socket

Whether it is a browser or a crawler framework, underneath they all use a socket to send the HTTP request and then receive the data the server returns. The browser renders the returned data and presents it to us; compared with a browser, a crawler framework simply skips the rendering step. To send an HTTP request with a socket, you first create a TCP socket and then connect it to the server. HTTP has no handshake of its own: the "3-way handshake" is the TCP handshake that takes place when the socket establishes the connection. Once the connection is established you can send data, but not arbitrary data: it must follow the HTTP protocol. The following code demonstrates the process of sending an HTTP request with a socket.

import socket

url = 'www.zhangdongshengtech.com'
port = 80

# create a TCP socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# connect to the server (this is where the TCP 3-way handshake happens)
sock.connect((url, port))

# build the request: request line, headers, then an empty line
request_url = 'GET /article-types/6/ HTTP/1.1\r\nHost: www.zhangdongshengtech.com\r\nConnection: close\r\n\r\n'
print(request_url)

# send the whole request
sock.sendall(request_url.encode())

response = b''

# receive data until the server closes the connection (recv then returns b'')
rec = sock.recv(1024)
while rec:
    response += rec
    rec = sock.recv(1024)
sock.close()

print(response.decode())

The code is simple and needs little explanation. What we focus on is the HTTP protocol. The content of request_url is:

GET /article-types/6/ HTTP/1.1

Host: www.zhangdongshengtech.com

Connection: close

This is the request header, and each line has its own role. The header and the message body are separated by an empty line, that is, the header ends with two consecutive line breaks (\r\n\r\n). Because we sent a GET request there is no message body, so the request ends right after those two line breaks. All three lines matter. The first line contains three important pieces of information:

GET is the method used for this request.

/article-types/6/ is the path of the resource being requested.

HTTP/1.1 is the version of the HTTP protocol. Earlier the version was 1.0; 1.1 is now the common baseline.
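The pieces of the request described above (method, path, version, then header lines and the closing empty line) can also be assembled programmatically. The following is only a minimal sketch: the helper name build_request and its parameters are illustrative assumptions and do not appear in the original code.

```python
def build_request(method, path, host, headers=None):
    """Assemble a minimal HTTP/1.1 request that has no message body.

    build_request is a hypothetical helper used only for illustration.
    """
    # Request line first, then the Host header that HTTP/1.1 requires.
    lines = [f'{method} {path} HTTP/1.1', f'Host: {host}']
    for name, value in (headers or {}).items():
        lines.append(f'{name}: {value}')
    # Every line ends with CRLF; an extra CRLF marks the end of the header.
    return ('\r\n'.join(lines) + '\r\n\r\n').encode('ascii')


request = build_request('GET', '/article-types/6/',
                        'www.zhangdongshengtech.com',
                        {'Connection': 'close'})
print(request)
```

The returned bytes can be passed straight to sock.sendall(). Building the request from a list of lines makes it harder to forget a \r\n than writing the whole string by hand.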

The second line specifies the host. A single server may host more than one web service, all of them on port 80. url = 'www.zhangdongshengtech.com' only tells the socket where to connect: it is just a domain name, which the program resolves to an IP address. If the server hosts several services, it needs a way to tell which one a request is meant for, so the client states the host in the request header and the server routes the request accordingly. This is name-based virtual hosting, and it is also how a reverse proxy such as nginx decides where to forward a request.

The third line sets Connection to close. In HTTP/1.1, if this header is not set the default is keep-alive: the server does not disconnect after returning the data but lets the client reuse the connection to send further requests. I deliberately set it to close so that the server closes the connection itself. That way, once the while loop has received all the data, sock.recv(1024) returns an empty bytes object (b'') and the loop ends. With keep-alive you cannot tell that all the data has arrived by waiting for the connection to close; you have to read the length of the data from the response header and use it to work out where the response to this request ends.

2. The returned message

The program prints the data returned by the server. Because there is a lot of it, only the message header is shown and explained here; the message body is just the source code of the web page and needs no comment.

HTTP/1.1 200 OK

Server: openresty/1.11.2.1

Date: Sun, 05 May 2019 03:11:05 GMT

Content-Type: text/html; charset=utf-8

Content-Length: 29492

Connection: close

Set-Cookie: session=eyJjc3JmX3Rva2VuIjp7IiBiIjoiTnPrd1pqZGhaamd6T1dObFlUQTRZVFJqTkRJeU9USmtNalU0TldOaU1UQXdNamsxTkdSaVpRPT0ifX0.D6_lyQ.4EqkK8taszUkPtMsolmur8pzFlowLQM; HttpOnly; Path=/
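A response header like the one shown above can be split into its first line and a dictionary of header fields. The sketch below is one possible way to do this, not the only one; parse_head is a hypothetical helper name, and the sample bytes are a shortened copy of the header above.

```python
def parse_head(head):
    """Split a raw response header (bytes, without the trailing empty line)
    into the parts of the first line and a dict of header fields.

    parse_head is a hypothetical helper used only for illustration.
    """
    lines = head.decode('iso-8859-1').split('\r\n')
    # First line: protocol version, status code, optional reason phrase.
    version, status, *reason = lines[0].split(' ', 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values such as dates contain colons.
        name, _, value = line.partition(':')
        # Header names are case-insensitive, so store them lower-cased.
        # A repeated header (e.g. several Set-Cookie lines) keeps the last value.
        headers[name.strip().lower()] = value.strip()
    return version, int(status), ''.join(reason), headers


sample = (b'HTTP/1.1 200 OK\r\n'
          b'Server: openresty/1.11.2.1\r\n'
          b'Content-Type: text/html; charset=utf-8\r\n'
          b'Content-Length: 29492\r\n'
          b'Connection: close')
version, status, reason, headers = parse_head(sample)
print(version, status, reason)
print(headers['content-length'])
```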

In the returned data there is likewise an empty line (two line breaks) between the message header and the message body. The first line, HTTP/1.1 200 OK, gives the version of the HTTP protocol followed by the status code for this request; 200 indicates a successful response. Other common status codes are 404, 500 and 302, whose meanings you can look up yourself. Among the headers that start on the second line, the most important one here is Content-Length. Its value, 29492, means that the message body is 29492 bytes long; if Connection were keep-alive, the client would have to read the message body according to this value. In both the request and the server's response, everything in the header except the first line is a key-value pair, much like a dictionary entry; these are called headers. This article touches on only a few of them; you can look up the others and their meanings yourself.

3. Parsing Content-Length from the header

Getting Content-Length from the header makes things slightly harder. While receiving the returned data, stop reading and close the connection once the message body has reached the required length.

import socket

url = 'www.zhangdongshengtech.com'
port = 80

# create a TCP socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# connect to the server
sock.connect((url, port))

# build the request; with no "Connection: close" the server keeps the connection open
request_url = 'GET /article-types/6/ HTTP/1.1\r\nHost: www.zhangdongshengtech.com\r\n\r\n'
print(request_url)

# send the request
sock.sendall(request_url.encode())

# receive until the empty line that separates the header from the message body
rec = b''
while b'\r\n\r\n' not in rec:
    data = sock.recv(1024)
    if not data:
        raise ConnectionError('connection closed before the header was complete')
    rec += data

index = rec.find(b'\r\n\r\n')  # where the header ends and the message body begins
head = rec[:index]
body = rec[index + 4:]

# get Content-Length (header names are case-insensitive)
content_length = 0
headers = head.split(b'\r\n')
for header in headers:
    if header.lower().startswith(b'content-length:'):
        content_length = int(header.split(b':', 1)[1].strip())

# keep reading until the message body has the announced length
length = len(body)
while length < content_length:
    rec = sock.recv(1024)
    if not rec:  # the server closed the connection early
        break
    length += len(rec)
    body += rec

sock.close()

print(length)
print(head.decode())
print(body.decode())
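On a keep-alive connection, Content-Length is also what tells the client where one response ends and the next begins. The following sketch illustrates this without any network access: socket.socketpair() stands in for the client and server ends, and the two canned responses are made up for the demonstration. The class name ResponseReader is a hypothetical helper, not part of the original code.

```python
import socket


class ResponseReader:
    """Read consecutive Content-Length framed responses from one socket.

    Hypothetical helper for illustration; it does not handle chunked
    transfer encoding or responses without Content-Length.
    """

    def __init__(self, sock):
        self.sock = sock
        self.buffer = b''  # bytes received but not yet consumed

    def _fill(self):
        chunk = self.sock.recv(1024)
        if not chunk:  # recv returns b'' once the peer has closed
            raise ConnectionError('connection closed mid-response')
        self.buffer += chunk

    def read_response(self):
        # Read until the empty line that ends the header.
        while b'\r\n\r\n' not in self.buffer:
            self._fill()
        head, _, self.buffer = self.buffer.partition(b'\r\n\r\n')
        length = 0
        for line in head.split(b'\r\n')[1:]:
            name, _, value = line.partition(b':')
            if name.strip().lower() == b'content-length':
                length = int(value.strip())
        # Read exactly `length` bytes of body; leave the rest buffered.
        while len(self.buffer) < length:
            self._fill()
        body, self.buffer = self.buffer[:length], self.buffer[length:]
        return head, body


# One end of the pair plays the server and sends two responses back to back.
client, server = socket.socketpair()
server.sendall(b'HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello'
               b'HTTP/1.1 200 OK\r\nContent-Length: 6\r\n\r\nworld!')
reader = ResponseReader(client)
first_head, first_body = reader.read_response()
second_head, second_body = reader.read_response()
print(first_body, second_body)
client.close()
server.close()
```

Keeping unconsumed bytes in a buffer matters here: a single recv may return the end of one response together with the start of the next.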
