
What is the workflow of the HTTP protocol?

2025-04-06 Update From: SLTechnology News&Howtos

Many readers are unsure how the HTTP protocol actually works, so this article summarizes the workflow of the HTTP protocol in detail, with clear steps, as a reference. Let's take a look.

HTTP stands for Hypertext Transfer Protocol. It is an application-layer protocol used to transfer hypertext from a World Wide Web (WWW) server to a local browser.

I. A brief introduction to the HTTP protocol

Before starting front-end development, we need to understand the following things.

1. What is the Internet?

Internet = physical connection media + Internet protocols

2. What is the purpose of the Internet?

Data transmission is no longer limited by location. Without it, to get the data on another host I would have to carry a hard drive over to that host and copy it.

3. What does surfing the Internet mean?

When a user surfs the Internet, the browser sends a request to the server and then downloads a text file from the server host for local display. The browser and the server communicate using the HTTP protocol.

Front-end development means preparing text files that are stored on the server host and then served to the browser to download and display. The browser client has two main functions: one is to send download requests to the server, and the other is to render the received code into a web page the user can browse. So before learning front-end development, we must first study the HTTP protocol.

An earlier version of this article was published before and was reprinted many times by technical sites. Because it was filed under the wrong category label at the time, it was later deleted. I am rewriting it partly to review the earlier material and partly in the hope of understanding it better and sharing it with you.

# 1. HTTP stands for Hypertext Transfer Protocol. It is used to transfer hypertext from a World Wide Web (WWW) server to a local browser.

# 2. HTTP works on the B/S (browser/server) architecture. Acting as the HTTP client, the browser sends a request (Request), addressed by a URL, to the HTTP server, that is, the Web server. Based on the request it receives, the Web server sends a response (Response) back to the client.

# 3. HTTP transfers data (HTML files, image files, and so on) on top of the TCP/IP protocol suite.

So far, the HTTP protocol has evolved through the following versions.

HTTP 0.9

HTTP 0.9 is the first version of the protocol. It traces back to the Web proposal of March 1989 (the protocol itself was documented as HTTP 0.9 in 1991) and is now obsolete.

# I: Its composition is extremely simple
# 1. Clients may only send GET requests.
# 2. Request headers are not supported.
# 3. Because there are no request headers, HTTP 0.9 supports only one kind of content: text. Pages can still be formatted with HTML, but images cannot be inserted.

# II: Stateless
# 1. HTTP 0.9 is typically stateless: each transaction is processed independently, and the connection is released when the transaction ends. In detail, an HTTP 0.9 transfer first establishes a TCP connection from the client to the Web server, the client sends a request, the Web server returns the page content, and the connection is then closed. If the requested page does not exist, no error code is returned.
# 2. As this shows, the stateless character of HTTP was already present in its first version, 0.9.

# III: HTTP 0.9 protocol documentation: http://www.w3.org/Protocols/HTTP/AsImplemented.html

HTTP/1.0

HTTP/1.0 is the second version of the HTTP protocol and is still in use in places today. It was the first version to state the version number in the communication, and compared with HTTP 0.9 it adds the following main features:

# 1. Request headers and response headers are supported.
# 2. The response starts with a status line, and the response body is no longer limited to hypertext.
# 3. Clients can submit data to the Web server with the POST method. The GET, HEAD, and POST methods are supported.
# 4. Long-lived (Keep-Alive) connections are supported, but short connections are still the default.
# 5. A caching mechanism and identity authentication are added.

HTTP/1.1

HTTP/1.1 is the version mainly used now. See the detailed explanation below.

HTTP 2.0

HTTP 2.0 (HTTP/2) is the next-generation HTTP protocol after 1.1 and is now widely supported by browsers and servers. Its main features are:

# 1. Multiplexing (binary framing)
The most important property of HTTP 2.0 is that it does not change the semantics of HTTP: methods, status codes, URIs, header fields, and the other core concepts stay as they are. What changes is how data is transferred, with the aim of breaking through the performance limits of the previous generation and achieving low latency and high throughput. The reason it is called 2.0 is the new binary framing layer. At this layer, HTTP 2.0 splits everything it transmits into smaller messages and frames and encodes them in binary: the header information of HTTP 1.x is carried in HEADERS frames, and the request body is carried in DATA frames. All HTTP 2.0 communication takes place on a single connection that can carry any number of bidirectional streams. Each stream carries messages, each message consists of one or more frames, and frames belonging to different streams can be interleaved on the wire and then reassembled using the stream identifier in each frame header.

# 2. Header compression
When a client requests many resources from the same server, such as the images of one web page, a large number of requests look almost identical. Compression is used to avoid sending this nearly identical information over and over.

# 3. Reset at any time
A drawback of HTTP 1.1 is that once a message of a given length is being transferred, it cannot easily be stopped. The only way is to break the TCP connection, which is expensive. In HTTP 2.0, a RST_STREAM frame stops one message so that a new one can start, without interrupting the connection, which improves bandwidth utilization.

# 4. Server push (Server Push)
The client requests resource X, the server decides that the client will probably also need resource Z, and it pushes Z to the client without waiting to be asked. After receiving it, the client can cache it for later use.

# 5. Priority and dependency
Each stream has its own priority. The client specifies which stream is the most important, together with dependency parameters, so that one stream can depend on another. Priorities can be changed dynamically at run time: when the user scrolls the page, the browser can tell the server which image is now the most important, or reprioritize a whole group of streams so that the key stream is served first.
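The binary framing layer described above can be made concrete with a minimal sketch. It assumes the 9-byte frame header layout of the HTTP/2 specification (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier). The function names `pack_frame`, `unpack_frame`, and `reassemble` are illustrative, not part of any library, and header compression and flow control are left out entirely.

```python
import struct

# A few frame type codes from the HTTP/2 specification (subset, for illustration).
DATA, HEADERS, RST_STREAM = 0x0, 0x1, 0x3


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Serialize one frame: a 9-byte header followed by the payload."""
    length = len(payload)
    if length >= 1 << 24 or not 0 <= stream_id < 1 << 31:
        raise ValueError("length or stream id out of range")
    # 24-bit length, then type (8 bits), flags (8 bits), stream id (32 bits, top bit reserved).
    header = struct.pack(">I", length)[1:] + struct.pack(">BBI", frame_type, flags, stream_id)
    return header + payload


def unpack_frame(data: bytes):
    """Parse one frame from the start of data; return (type, flags, stream_id, payload, rest)."""
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", data[3:9])
    stream_id &= 0x7FFFFFFF  # ignore the reserved bit
    payload = data[9:9 + length]
    return frame_type, flags, stream_id, payload, data[9 + length:]


def reassemble(wire: bytes) -> dict:
    """Group DATA payloads by stream id, keeping the order within each stream."""
    streams = {}
    while wire:
        frame_type, _flags, stream_id, payload, wire = unpack_frame(wire)
        if frame_type == DATA:
            streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams


# Frames of stream 1 and stream 3 interleaved on one connection.
wire = (
    pack_frame(DATA, 0x0, 1, b"hel")
    + pack_frame(DATA, 0x0, 3, b"wor")
    + pack_frame(DATA, 0x1, 1, b"lo")   # 0x1 = END_STREAM
    + pack_frame(DATA, 0x1, 3, b"ld")
)
print(reassemble(wire))  # {1: b'hello', 3: b'world'}
```

The stream identifier in each header is what lets the receiver put interleaved frames back together, which is the multiplexing property the text describes.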

HTTP/1.1 detailed explanation

HTTP/1.1 is the third version of the HTTP protocol and currently the mainstream version.

HTTP 1.1 introduces many key performance optimizations: keep-alive connections, request pipelining, chunked transfer encoding, byte range requests, and so on.

1. Persistent Connection (keepalive connection)

A persistent connection allows an HTTP device to keep the TCP connection open after a transaction ends, so that later HTTP requests reuse the same connection until the client or the server decides to close it.

How does HTTP 1.1 compare with HTTP 1.0? In HTTP 1.0, using a persistent connection requires adding the request header Connection: Keep-Alive. In HTTP 1.1, all connections are persistent by default unless one side states otherwise by sending the header Connection: close.
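The rule above (opt-in for HTTP 1.0, default for HTTP 1.1, `Connection: close` to opt out) can be written as a small decision function. This is a sketch; `is_persistent` is a hypothetical helper name and the headers are assumed to arrive as a plain dictionary.

```python
def is_persistent(http_version: str, headers: dict) -> bool:
    """Decide whether the connection stays open after this message."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    tokens = {token.strip().lower() for token in lowered.get("connection", "").split(",")}
    if "close" in tokens:
        return False                 # either version: explicit opt-out
    if http_version == "HTTP/1.1":
        return True                  # persistent by default
    return "keep-alive" in tokens    # HTTP/1.0: persistent only on request


print(is_persistent("HTTP/1.1", {}))                            # True
print(is_persistent("HTTP/1.1", {"Connection": "close"}))       # False
print(is_persistent("HTTP/1.0", {}))                            # False
print(is_persistent("HTTP/1.0", {"Connection": "Keep-Alive"}))  # True
```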

2. Pipelining (request pipeline)

# How the request pipeline works
The HTTP/1.1 specification says: A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response). A server MUST send its responses to those requests in the same order that the requests were received.

In other words, after sending a request the client does not have to wait for the result. It can send the second request straight away, but the response to the request sent first must come back first. It is like the production line in an instant-noodle factory: there is never just one pack of noodles on the line at a time.
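The ordering rule for pipelining can be sketched without any networking: the client keeps a first-in, first-out queue of outstanding requests, and every response that arrives belongs to the oldest request still waiting. The helper name `match_pipelined` and the string placeholders for requests and responses are assumptions made for this illustration.

```python
from collections import deque


def match_pipelined(requests, responses):
    """Pair each response with the oldest unanswered request (FIFO order)."""
    pending = deque(requests)          # requests already sent on the connection
    pairs = []
    for response in responses:
        if not pending:
            raise ValueError("response received with no request outstanding")
        pairs.append((pending.popleft(), response))
    return pairs, list(pending)        # matched pairs, requests still waiting


pairs, waiting = match_pipelined(
    ["GET /a", "GET /b", "GET /c"],    # sent back to back, no waiting
    ["200 for a", "404 for b"],        # responses arrive in the same order
)
print(pairs)    # [('GET /a', '200 for a'), ('GET /b', '404 for b')]
print(waiting)  # ['GET /c']
```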

3. Chunked transfer encoding

# 1. Chunked encoding sends the entity in blocks, each prefixed with its length, until a block of length 0 marks the end of the transfer. This is especially useful when the length of the entity is not known in advance (for example, data generated dynamically from a database).

# 2. Transfer encoding and chunked encoding
When the response headers include Transfer-Encoding: chunked, the message body is split into chunks of known size that are sent one after another. The sender no longer needs to know the size of the whole message before sending it, so no Content-Length header has to be written.

# 3. Chunked transfer and persistent connections
On a persistent connection without chunking, the server must work out the size of the body before sending it and put that size in the response header (Content-Length: number of bytes in the body). If the server generates content dynamically, it may not know the body size in advance, and chunked encoding solves exactly this case: the server sends the body chunk by chunk, stating the size of each chunk, and then sends a chunk of size 0 as the end marker so that the connection is ready for the next response. In short, a response must carry a Content-Length header unless it uses Transfer-Encoding: chunked. See the HTTP/1.1 specification: https://tools.ietf.org/html/rfc2616

# 4. About the Content-Length header
If the request contains Accept-Encoding: gzip and the server compresses the content before returning it, Content-Length is the compressed length. If the request does not contain Accept-Encoding: gzip, the server does not use gzip compression. If the server is also configured not to use chunked encoding, the response must carry a Content-Length header, and its value is the size of the uncompressed file.
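To make the chunk format concrete, here is a minimal sketch of an encoder and decoder for a chunked body: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The function names are illustrative, and optional trailers after the last chunk are ignored for simplicity.

```python
def encode_chunked(chunks) -> bytes:
    """Encode an iterable of byte strings as a chunked message body."""
    out = b""
    for chunk in chunks:
        if chunk:  # an empty chunk would be mistaken for the terminator
            out += f"{len(chunk):X}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    return out + b"0\r\n\r\n"          # zero-size chunk marks the end


def decode_chunked(data: bytes) -> bytes:
    """Decode a chunked body back into the original bytes (trailers ignored)."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line may carry extensions after ';', which are skipped here.
        size = int(data[pos:line_end].split(b";", 1)[0].decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            return body
        body += data[pos:pos + size]
        pos += size + 2                # skip the chunk data and its trailing CRLF


encoded = encode_chunked([b"Hello, ", b"world"])
print(encoded)                  # b'7\r\nHello, \r\n5\r\nworld\r\n0\r\n\r\n'
print(decode_chunked(encoded))  # b'Hello, world'
```

Because each chunk announces its own size, the sender never needs the total length up front, which is why Content-Length can be omitted in this mode.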

4. Byte range request

HTTP 1.1 supports transferring only part of a resource. For example, when the client already has part of the content, it can save bandwidth by requesting only the rest from the server. This is done with the Range header field in the request message, which asks for just a portion of the resource. In the response message, the Content-Range header field states the offset and length of the part being returned. If the server returns the requested range, the response code is 206 (Partial Content).
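A sketch of how a server might answer a single byte range follows. It assumes the `bytes=start-end`, `bytes=start-`, and `bytes=-suffix` forms, falls back to a full 200 response for anything else (including multi-range requests), and uses the hypothetical helper name `serve_range`. Malformed numbers are not handled.

```python
def serve_range(resource: bytes, range_header: str):
    """Return (status, headers, body) for one 'bytes=...' range request."""
    total = len(resource)
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        # Unsupported form: ignore the Range header and send everything.
        return 200, {"Content-Length": str(total)}, resource
    start_text, _, end_text = spec.strip().partition("-")
    if start_text == "":                       # suffix form: the last N bytes
        start, end = max(total - int(end_text), 0), total - 1
    else:
        start = int(start_text)
        end = min(int(end_text), total - 1) if end_text else total - 1
    if start > end or start >= total:
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    body = resource[start:end + 1]             # both ends are inclusive
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body


resource = b"0123456789"
print(serve_range(resource, "bytes=2-5"))
# (206, {'Content-Range': 'bytes 2-5/10', 'Content-Length': '4'}, b'2345')
```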

HTTP 1.1 also adds the following features:

# 1. Request and response messages should both support the Host header field
HTTP 1.0 assumes that each server is bound to a unique IP address, so the URL in the request message does not carry the host name (hostname). With the development of virtual hosting, however, one physical server can hold multiple virtual hosts (multi-homed Web servers) that share a single IP address. This makes the Host header necessary.

# 2. New request methods
HTTP 1.1 adds the OPTIONS, PUT, DELETE, TRACE, and CONNECT methods.

# 3. Cache handling
HTTP/1.1 adds new caching features on top of 1.0: it introduces entity tags, usually called ETags, and a more powerful Cache-Control header.

II. The HTTP request (Request)

1. The requested URL

# 1. What is a URL?
A URL is a special kind of URI that contains enough information to find a resource. URL stands for Uniform Resource Locator. It is the address used to identify a resource on the Internet.

# 2. The following URL is used as an example to introduce the components of an ordinary URL:
http://www.aspxfans.com:8080/news/index.asp?boardID=5&ID=24618&page=1#name

A complete URL includes the following parts:

# A. Protocol part: http://
The protocol part of this URL is "http:", and the "//" after it is a delimiter. It means the page is served over the HTTP protocol. Many protocols can be used on the Internet, such as HTTP and FTP. If you leave it out, the browser fills it in automatically. ==> required

# B. Domain name part: www.aspxfans.com
An IP address can also be used in a URL in place of the domain name. ==> required

# C. Port part: 8080
The port follows the domain name, separated from it by ":". The port is not a required part of a URL. If it is omitted, the default port 80 is used.

# D. Virtual directory part: /news/
It runs from the first "/" after the domain name to the last "/". The virtual directory is not a required part of a URL.

# E. File name part: index.asp
It runs from the last "/" after the domain name to the "?". If there is no "?", it runs from the last "/" to the "#". If there is neither "?" nor "#", it runs from the last "/" to the end of the URL. The file name is not a required part of a URL. If it is omitted, a default file name is used.

# F. Parameter part: boardID=5&ID=24618&page=1
The part between the "?" and the "#" is the parameter part, also called the search part or query part. There can be several parameters, separated by "&". ==> not required

# G. Anchor part: name
The part from the "#" to the end is the anchor. The anchor is not a required part of a URL.
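The breakdown above can be checked against Python's standard `urllib.parse` module. This is a small sketch that splits the article's example URL; note that the standard library returns the whole path, so separating it into "virtual directory" and "file name" with `posixpath.split` is an extra step added here to mirror the article's terminology.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

url = "http://www.aspxfans.com:8080/news/index.asp?boardID=5&ID=24618&page=1#name"
parts = urlsplit(url)

# Split the path into the article's "virtual directory" and "file name" parts.
directory, filename = posixpath.split(parts.path)

components = {
    "protocol": parts.scheme,          # A. protocol part
    "domain": parts.hostname,          # B. domain name part
    "port": parts.port,                # C. port part (None if omitted)
    "directory": directory + "/",      # D. virtual directory part
    "filename": filename,              # E. file name part
    "parameters": parse_qs(parts.query),  # F. parameter part
    "anchor": parts.fragment,          # G. anchor part
}
for name, value in components.items():
    print(f"{name}: {value}")
```

`parse_qs` returns a list for each parameter because a name may legally appear more than once in a query string.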

What is the difference between URL, URI, and URN?

# 1. URI, uniform resource identifier, is used to uniquely identify a resource.
Every kind of resource available on the Web, such as HTML documents, images, video clips, and programs, is identified by a URI. A URI generally consists of three parts: ① the naming mechanism for accessing the resource, ② the name of the host that stores the resource, and ③ the name of the resource itself, expressed as a path, with the emphasis on the resource.

# 2. URL, uniform resource locator, is a specific kind of URI: it identifies a resource and also says how to locate it.
A URL is a string that describes an information resource on the Internet. It is used mainly in WWW client and server programs, a famous early example being Mosaic. URLs describe all kinds of information resources, including files, server addresses, and directories, in one unified format. A URL generally consists of three parts: ① the protocol (or service mode), ② the IP address of the host that stores the resource (sometimes with a port number), and ③ the specific address of the resource on that host, such as a directory and file name.

# 3. URN, uniform resource name, identifies a resource by name, such as mailto:java-net@java.sun.com.

URI is an abstract, high-level concept that defines uniform resource identification, while URL and URN are specific ways of identifying resources. Both URLs and URNs are URIs. Generally speaking, every URL is a URI, but not every URI is a URL, because URIs also include the subclass uniform resource name (URN), which names a resource without saying how to locate it. The Java documentation treats mailto, news, and isbn URIs as examples of URNs. (Strictly speaking, a URN uses the "urn:" scheme, for example urn:isbn:...)

In Java, a URI instance can be either absolute or relative, as long as it follows the URI syntax rules. The URL class goes beyond syntax and also contains the information needed to locate the resource, so it cannot be relative. In the Java class library, the URI class has no methods for accessing the resource and its only job is parsing. The URL class, by contrast, can open a stream to the resource.

# The relationship is a bit like a CSS attribute selector: both are used to locate and filter, one within a web page and the other among resources on a global scale.

2. Format of the request (Request)

The request message that the client sends to the server consists of four parts: the request line, the request headers (header), a blank line, and the request data.

The request line begins with a method such as GET or POST, followed by a space, the requested URI, and the protocol version. In detail:

    GET /mayite/p/7278389.html HTTP/1.1
    Host: www.cnblogs.com
    Connection: keep-alive
    Cache-Control: max-age=0
    Upgrade-Insecure-Requests: 1
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
    Accept-Encoding: gzip, deflate
    Accept-Language: zh-CN,zh

# Part 1: the request line describes the request type, the resource to be accessed, and the HTTP version used.
GET is the request type, /mayite/p/7278389.html is the resource to be accessed, and the last part of the line says that HTTP version 1.1 is used.

# Part 2: the request headers start on the second line, immediately after the request line (the first line), and carry additional information for the server.
Host gives the destination of the request. User-Agent, which both server-side and client-side scripts can read, is an important basis for browser type detection. This information is defined by your browser and sent automatically with each request. And so on.

# Part 3: the blank line after the request headers is required.
Even if the request data in Part 4 is empty, the blank line must be there.

# Part 4: the request data, also called the body, can carry any other data.
The request data in this example is empty. A GET request normally has no body, while methods such as POST carry one.

To capture a POST request, you can log in to a website with a browser and enter a wrong account and password:

    POST / HTTP/1.1
    Host: www.wrox.com
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022)
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 40
    Connection: Keep-Alive

    name=Professional%20Ajax&publisher=Wiley
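The four-part layout (request line, headers, blank line, data) can be sketched as a pair of helpers that build and parse the raw bytes. The names `build_request` and `parse_request` are made up for this illustration, and the parser assumes a well-formed message with the whole body already in memory.

```python
def build_request(method: str, target: str, headers: dict, body: bytes = b"") -> bytes:
    """Assemble request line + headers + blank line + body."""
    lines = [f"{method} {target} HTTP/1.1"]                   # part 1: request line
    lines += [f"{name}: {value}" for name, value in headers.items()]  # part 2: headers
    head = "\r\n".join(lines) + "\r\n\r\n"                    # part 3: the blank line
    return head.encode("ascii") + body                        # part 4: request data


def parse_request(raw: bytes):
    """Split a raw request back into its four parts."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, target, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, target, version, headers, body


# The POST example from the text; Content-Length is computed from the body.
body = b"name=Professional%20Ajax&publisher=Wiley"
raw = build_request(
    "POST",
    "/",
    {
        "Host": "www.wrox.com",
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(body)),
    },
    body,
)
print(raw.decode("ascii"))
print(parse_request(raw))
```

Computing Content-Length from the actual body, rather than typing it by hand, avoids the mismatch that makes servers wait for bytes that never arrive.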

3. HTTP request method

# 1. HTTP defines a number of methods for interacting with the server. HTTP 1.0 defines three request methods: GET, POST, and HEAD. HTTP 1.1 adds five more: OPTIONS, PUT, DELETE, TRACE, and CONNECT.

# 2. The general meaning of each method:
GET: requests the specified page and returns the entity body.
HEAD: like a GET request, except that the response carries no body. It is used to obtain the headers.
POST: submits data to the specified resource for processing (for example, submitting a form or uploading a file). The data is contained in the request body. A POST request may create new resources and/or modify existing ones.
PUT: the data sent from the client to the server replaces the content of the specified document.
DELETE: asks the server to delete the specified page.
CONNECT: reserved in HTTP/1.1 for proxy servers that can switch the connection to a tunnel.
OPTIONS: lets the client find out what the server is capable of.
TRACE: echoes back the request received by the server. Mainly used for testing or diagnosis.

# 3. A URL describes a resource on the network, and the four most basic HTTP methods, GET, POST, PUT, and DELETE, correspond roughly to the four operations of querying, creating, updating, and deleting that resource (by convention POST creates and PUT replaces or updates, although POST is also widely used for updates).

# 4. The most common methods are GET and POST. GET is generally used to obtain or query resource information, while POST is generally used to submit or update resource information.

The difference between GET request and POST request

To put it simply, the data of a GET request shows up in the URL, while the data of a POST request does not. Of course, this is only the surface difference.
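How the same form data ends up in the URL for GET and in the body for POST can be sketched with the standard `urllib.parse` module. The parameter names and values reuse the login.action example that appears in this section; the encoding shown is percent-encoding of the UTF-8 bytes, which is what `urlencode` applies by default.

```python
from urllib.parse import quote, quote_plus, urlencode

params = {"name": "hyddd", "password": "idontknow", "verify": "你好"}

# Letters and digits are sent as is; other characters become %XX per UTF-8 byte.
query = urlencode(params)
print(query)  # name=hyddd&password=idontknow&verify=%E4%BD%A0%E5%A5%BD

# GET: the encoded data is appended to the URL after '?'.
get_url = "login.action?" + query

# POST: the same encoded data travels in the request body instead.
post_body = query.encode("ascii")

# A space becomes '+' in form encoding and '%20' in plain percent-encoding.
print(quote_plus("a b"))  # a+b
print(quote("a b"))       # a%20b
```

Either way the data is only encoded, not encrypted: anyone who can read the request can decode it.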

# 1. Difference 1: the parameters are organized differently
Data submitted with GET is placed after the URL: a "?" separates the URL from the data, and the parameters are joined with "&", for example login.action?name=hyddd&password=idontknow&verify=%E4%BD%A0%E5%A5%BD. English letters and digits are sent as is, a space is converted to "+", and Chinese or other characters are percent-encoded (URL-encoded, not encrypted), giving something like %E4%BD%A0%E5%A5%BD, where the XX in %XX is the hexadecimal value of one byte of the character.
The POST method puts the submitted data in the body of the HTTP message.
Therefore, data submitted with GET is displayed in the address bar, while with POST the address bar does not change.

# 2. Difference 2: limits on the size of the transferred data
First of all, the HTTP protocol does not limit the size of the transferred data, and the HTTP specification does not limit URL length. The limits met in practice are:
GET: specific browsers and servers limit URL length. For example, IE limits a URL to 2083 characters (2K+35). Other browsers, such as Netscape and Firefox, have no length limit in theory, and the limit depends on the operating system. So when data is submitted with GET, the amount of data is limited by the URL length.
POST: because the values are not passed in the URL, the data is unlimited in theory. In practice, each Web server limits the size of data submitted with POST, and Apache and IIS6 each have their own settings.
In summary, the size of data submitted with GET is limited (because browsers limit URL length), while data submitted with POST has no limit in theory.
(In ASP, Request.QueryString is used to read values submitted with GET, and Request.Form is used to read values submitted with POST.)

# 3. Difference 3: security
POST is somewhat safer than GET. For example, if you submit data with GET, the user name and password appear in plain text in the URL, so (1) the login page may be cached by the browser, and (2) anyone who views the browser history can obtain your account and password. In addition, submitting data with GET may also make Cross-site request forgery attacks easier.

III. The HTTP response (Response)

After receiving and processing the request sent by the client, the server returns an HTTP response message (Response).

The HTTP response also consists of four parts: the status line, the message headers, a blank line, and the response body.

# Part 1: the status line consists of three parts: the HTTP protocol version, the status code, and the status message.
In a typical response, the first line is the status line: HTTP/1.1 indicates HTTP version 1.1, the status code is 200, and the status message is OK.

# Part 2: the message headers describe additional information for the client.
Date: the date and time when the response was generated. Content-Type: the MIME type, for example HTML (text/html) with the encoding UTF-8.

# Part 3: the blank line after the message headers is required.

# Part 4: the response body is the text that the server returns to the client.
The HTML after the blank line is the response body. The browser renders this part, which produces the web page the user sees.
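The four parts of a response can be separated with a few lines of code. The sketch below uses a made-up sample response (the source does not include one), a hypothetical helper `parse_response`, and classifies the status code by its first digit, as the status code list that follows explains.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_response(raw: bytes) -> dict:
    """Split a raw response into status line fields, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")       # the blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    fields = status_line.split(" ", 2)               # version, code, optional message
    version, code = fields[0], int(fields[1])
    reason = fields[2] if len(fields) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {
        "version": version,
        "status": code,
        "reason": reason,
        "class": STATUS_CLASSES.get(code // 100, "unknown"),
        "headers": headers,
        "body": body,
    }


# Hypothetical sample response, for illustration only.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)
parsed = parse_response(sample)
print(parsed["version"], parsed["status"], parsed["reason"], parsed["class"])
```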

The status code consists of three digits. The first digit defines the class of the response, and there are five classes:

1xx: informational - the request has been received, continue processing
2xx: success - the request has been successfully received, understood, and accepted
3xx: redirection - further action must be taken to complete the request
4xx: client error - the request has a syntax error or cannot be fulfilled
5xx: server error - the server failed to fulfil a valid request

Common status codes:

200 OK // the client request succeeded
400 Bad Request // the client request has a syntax error and cannot be understood by the server
401 Unauthorized // the request is not authorized; this status code must be used together with the WWW-Authenticate header field
403 Forbidden // the server received the request but refuses to serve it
404 Not Found // the requested resource does not exist, e.g. a wrong URL was entered
500 Internal Server Error // the server hit an unexpected error
503 Service Unavailable // the server cannot currently handle the client's request and may return to normal after a while

IV. The full workflow of the HTTP protocol

The HTTP protocol defines how a Web client requests Web pages from a Web server and how the server delivers those pages to the client. HTTP uses a request/response model. The client sends the server a request message containing the request method, URL, protocol version, request headers, and request data. The server responds with a status line (the protocol version and a success or error code), followed by server information, response headers, and the response data.

The following are the steps for the HTTP request / response:

1. The client connects to the Web server

An HTTP client, usually a browser, establishes a TCP socket connection to the HTTP port of the Web server (80 by default). For example, http://www.oakcms.cn.

2. Send the HTTP request

Through the TCP socket, the client sends a text request message to the Web server. A request message consists of four parts: the request line, the request headers, a blank line, and the request data.

3. The server accepts the request and returns an HTTP response

The Web server parses the request and locates the requested resource. The server writes a copy of the resource to the TCP socket, and the client reads it. A response consists of four parts: the status line, the response headers, a blank line, and the response data.

4. Release the TCP connection

If the connection mode is close, the server actively closes the TCP connection and the client closes its side passively, releasing the TCP connection. If the connection mode is keepalive, the connection is kept open for a period of time, during which it can carry further requests.

5. The client browser parses the HTML content

The client browser first parses the status line and checks the status code to see whether the request succeeded. It then parses each response header. The response headers tell it that an HTML document of a certain number of bytes follows, and which character set the document uses. Finally the browser reads the response data (HTML), formats it according to HTML syntax, and displays it in the browser window.
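The five steps can be walked through end to end with the standard library alone. The sketch below is an illustration under stated assumptions: instead of a real website it starts a tiny local server (`DemoHandler`, a made-up name) on 127.0.0.1 so that the example is self-contained, and the client uses a raw TCP socket so that each step stays visible. It uses the close connection mode, so the end of the response is signalled by the server closing the connection.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>hello</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    """Tiny stand-in for a Web server; serves one fixed page."""
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(host: str, port: int, path: str = "/"):
    # Step 1: the client opens a TCP connection to the server's HTTP port.
    with socket.create_connection((host, port), timeout=5) as sock:
        # Step 2: send the request line, headers, and the blank line.
        request = f"GET {path} HTTP/1.1\r\nHost: {host}:{port}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        # Steps 3 and 4: read the response until the server closes the connection.
        received = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            received.append(data)
    raw = b"".join(received)
    # Step 5: separate the status line and headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return status_line, body


def demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)   # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


status_line, body = demo()
print(status_line)
print(body.decode("utf-8"))
```

A real browser would go on to render the HTML; here the body is simply printed. With keepalive instead of close, the client would have to rely on Content-Length (or chunked encoding) to know where the response ends.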

Key summary of the HTTP protocol

# 1. Simple and fast
When a client requests a service from the server, it only needs to send the request method and path. The commonly used request methods are GET, HEAD, and POST, and each method specifies a different kind of interaction between the client and the server. Because the HTTP protocol is simple, an HTTP server program is small, so communication is fast.

# 2. Flexible
HTTP allows any type of data object to be transferred. The type being transferred is marked by Content-Type.

# 3. Connectionless
HTTP being connectionless means that when a client requests the same resource several times in a short period, the server cannot tell whether it has already responded to that user, so every HTTP request first has to open a TCP connection to the server and go through the "three-way handshake". This is quite expensive for servers with heavy traffic, and it is the drawback of connectionless HTTP.
To address this, non-persistent and persistent connections were designed. Strictly speaking, non-persistent and persistent connections in HTTP are about the underlying TCP connection. When client/server interaction runs over TCP, the application uses non-persistent connections if each request/response pair travels over a separate TCP connection, and persistent connections if every request/response pair travels over the same TCP connection.
Non-persistent connection: total time for one HTTP request/response = time for the client to establish the connection + time to send the request message + time for the server to transmit the HTML file.
Persistent connection: the server keeps the TCP connection open after sending the response. Later request and response messages between the same client and server travel over the same connection, so no new TCP connection has to be established.

# 4. Stateless
HTTP is a stateless protocol, which means that the protocol itself does not retain information about the client.
The advantage of statelessness is that the server responds faster when it does not need earlier information.
The disadvantage is that if later processing needs earlier information, that information must be retransmitted, which may increase the amount of data transferred per connection.
Statelessness makes interactive applications harder to implement, for example recording which pages a user has visited or checking whether a user has permission to access something. As a result, two techniques for maintaining HTTP state emerged: Cookie and Session.

# 5. Supports both the B/S and C/S modes.

That concludes this overview of the workflow of the HTTP protocol. I hope the content is helpful to you.


© 2024 shulou.com SLNews company. All rights reserved.
