Detailed explanation of the working principle of HTTP

Updated: 2025-03-31. Source: SLTechnology News&Howtos (Shulou.com), Internet Technology section.

This article explains how HTTP works: how computers communicate over TCP/IP, how an HTTP request and response are exchanged, how HTTPS secures that exchange, and what practical length limits apply to URLs, POST data, and cookies.

1. Brief introduction to HTTP

HTTP (HyperText Transfer Protocol) is the application-layer protocol used to transfer hypertext from a WWW server to a local browser. It defines how the browser requests a document and how the server returns it, so that hypertext documents are transmitted correctly and efficiently. It also lets the browser determine which part of a document is transferred and which content is displayed first (for example, text before graphics).

Before we understand how HTTP works, let's take a look at the communication between computers.

2. Communication between computers

The key technology of the Internet is the TCP/IP protocol suite, and communication between two computers on the Internet is carried out through it. TCP/IP is in fact two protocols:

TCP (Transmission Control Protocol) and IP (Internet Protocol).

IP: communication between computers

IP is the mechanism computers use to identify and reach each other. Each computer has an IP address that identifies it on the Internet. IP is responsible for sending and receiving packets: messages (or other data) are divided into small, independent packets, and IP routes each packet to its destination.

IP only lets computers send packets to each other. It does not check that packets arrive in the order in which they were sent, or that their contents are intact (only the header is checksummed). The Transmission Control Protocol (TCP) was layered on top of IP to provide that verification.

TCP: communication between applications

TCP ensures that data arrives in the correct order and uses checksums to detect whether its contents have changed in transit. TCP adds the concept of a port on top of the IP address, which allows one computer to provide several services over the network. Some port numbers are reserved for particular services; these are the well-known ports.

Service or daemon: on the machine providing a service, a program listens for traffic on a specific port. For example, SMTP email traffic conventionally uses port 25, and HTTP traffic for the WWW uses port 80.

When an application wants to communicate with another application through TCP, it sends a connection request to an exact address (IP address plus port). After the two parties complete a handshake, TCP establishes a full-duplex connection between the two applications, so both sides can send and receive at the same time. The connection is logical: it does not reserve the physical line, which is still shared with other traffic. TCP controls data transfer from the application to the network. It splits the data into segments that are carried in IP packets, and reassembles them when they arrive.

TCP/IP means that TCP and IP work together as an upper and a lower layer.

TCP is responsible for communication between applications (such as your browser and web server software). IP is responsible for communication between computers. TCP divides the data and loads it into IP packets, and IP sends the packets to the recipient. Along the way, IP routers forward each packet toward its destination, choosing a route according to traffic, network errors, or other parameters. TCP at the receiving end then puts the data back together.
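The split-and-reassemble behavior of TCP can be illustrated with a small simulation. This is only a sketch, not real TCP: the function names and the 8-byte segment size are assumptions chosen for the example, and byte offsets stand in for TCP sequence numbers.

```python
import random

def split_into_segments(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, chunk) pairs; the byte offset acts as the sequence number."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]

def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Sort the segments by sequence number and join their payloads."""
    return b"".join(chunk for _, chunk in sorted(segments))

message = b"GET /index.htm HTTP/1.1"
segments = split_into_segments(message, size=8)  # hypothetical tiny segment size
random.shuffle(segments)                         # IP may deliver packets out of order
restored = reassemble(segments)
print(restored == message)
```

Because every segment carries its own sequence number, the receiver can restore the original byte stream regardless of the order in which the packets arrive.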

3. The protocol layer where HTTP protocol is located

HTTP is based on the TCP protocol. The corresponding protocols at each layer of the TCP/IP protocol reference model are shown in the following figure, where HTTP is the protocol of the application layer.

4. HTTP request response model

HTTP is a standard client-server protocol built from requests and responses. The client always initiates the request, and the server sends back the response. See the following figure:

HTTP is a stateless protocol. Stateless means that the server keeps no information about earlier requests: each request is handled on its own, independently of the ones before it. In the simplest case, the client (Web browser) sends a request, the server returns a response, the connection is closed, and nothing about the exchange is retained on the server. HTTP follows the request/response model: the client sends a request, and the server processes it and returns the appropriate response. Every HTTP exchange is made up of such request/response pairs.

5. HTTP working process

An HTTP operation is called a transaction. The whole process is as follows:

1) Address resolution

For example, a client browser requests this page: http://localhost.com:8080/index.htm

The URL is broken down into protocol name, hostname, port, and object path. For this address, the result is:

Protocol name: http

Hostname: localhost.com

Port: 8080

Object path: /index.htm

In this step, the Domain Name System (DNS) is needed to resolve the domain name localhost.com to the IP address of the host.
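The decomposition of the example URL can be reproduced with Python's standard library. This is a minimal sketch: it only parses the URL, and the DNS lookup itself (which could be done with socket.getaddrinfo) is left out because it requires network access.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://localhost.com:8080/index.htm")

# The four components named in the text above.
print("Protocol name:", parts.scheme)
print("Hostname:", parts.hostname)
print("Port:", parts.port)
print("Object path:", parts.path)
```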

2) Encapsulating the HTTP request

The parts above are combined with local information (such as browser details) and encapsulated into an HTTP request message.

3) Establishing a TCP connection (TCP's three-way handshake)

Before any HTTP exchange begins, the client (Web browser) must establish a connection with the server over the network. This is done through TCP, which together with IP forms the TCP/IP protocol family on which the Internet is built; this is why the Internet is also described as a TCP/IP network. HTTP is an application-layer protocol that sits above TCP, and a higher-layer protocol can only operate once the lower-layer connection exists, so the TCP connection is established first. The default port for HTTP is 80; in this example the port is 8080.

4) The client sends the request

After the connection is established, the client sends a request to the server. The request line contains the method, the Uniform Resource Identifier (URI), and the protocol version. It is followed by MIME-like header information, including request modifiers and client information, and optionally a body.
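To make the request format concrete, the sketch below assembles the bytes of a request for the example URL. The build_request helper and the User-Agent value are hypothetical, chosen only for illustration; real browsers send many more headers.

```python
def build_request(method: str, path: str, host: str, port: int) -> bytes:
    """Assemble a minimal HTTP/1.1 request: request line, headers, then a blank line."""
    lines = [
        f"{method} {path} HTTP/1.1",       # request line: method, URI, protocol version
        f"Host: {host}:{port}",            # required in HTTP/1.1
        "User-Agent: example-client/0.1",  # hypothetical client information
        "Accept: text/html",               # a request modifier
        "",                                # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

request = build_request("GET", "/index.htm", "localhost.com", 8080)
print(request.decode("ascii"))
```

A GET request like this one has no body, so the message ends right after the blank line.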

5) The server responds

After receiving the request, the server returns a response. It begins with a status line containing the protocol version and a success or error code, followed by MIME-like header information, including server information and entity information, and optionally a body.

After the server has sent the headers, it sends a blank line to mark the end of the header section. It then sends the actual data requested by the user, in the format described by the Content-Type response header.
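The response layout (status line, headers, blank line, body) can be shown by parsing a hand-written response. This is a simplified sketch: the parse_response helper and the sample response are made up for the example, and real parsers also handle cases such as chunked encoding and repeated headers.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into status line fields, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the blank line separates headers from body
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body

# Hypothetical response used only for illustration.
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html\r\n"
       b"Content-Length: 13\r\n"
       b"\r\n"
       b"<p>hello</p>\n")

version, code, reason, headers, body = parse_response(raw)
print(version, code, reason, headers["content-type"], len(body))
```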

6) The server closes the TCP connection

In HTTP/1.0, once the Web server has sent the requested data to the browser, it closes the TCP connection, unless the browser or server adds this line to its headers:

Connection: keep-alive

With this header, the TCP connection remains open after the response is sent, so the browser can continue to send requests over the same connection. In HTTP/1.1, connections are persistent by default. Keeping the connection open saves the time needed to establish a new connection for each request and saves network bandwidth.
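Connection reuse can be observed on the local machine with Python's standard library. The sketch below starts a throwaway server on 127.0.0.1 (the Handler class, the paths /a and /b, and the response body are assumptions for the example) and sends two requests through one client connection. If the local port is the same for both requests, the same TCP connection was used.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))  # tells the client where the body ends
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: let the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
local_ports = []
for path in ("/a", "/b"):
    conn.request("GET", path, headers={"Connection": "keep-alive"})
    response = conn.getresponse()
    response.read()  # the body must be consumed before the next request
    local_ports.append(conn.sock.getsockname()[1])

conn.close()
server.shutdown()
server.server_close()
print(local_ports[0] == local_ports[1])
```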

6. Data flow of each layer in HTTP protocol stack

First of all, let's take a look at the data organization of the client request in each layer protocol as shown below:

The server parsing the client request is the process of reverse operation, as shown in the following figure:

When the client initiates a request:

The client encapsulates the request into an HTTP message -> a TCP segment -> an IP packet -> a data frame -> a bit stream (binary data) produced by the hardware, which is finally sent to the destination through the physical hardware (network card).

The server hardware first receives the bit stream and converts it back into a data frame, from which the IP packet is extracted. The IP layer parses the IP packet and finds a TCP segment inside, the TCP layer parses the segment and finds the HTTP message, and the HTTP layer parses the message to get the data.
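The wrap-then-unwrap order can be mimicked with a toy model in which each layer simply prepends a text marker. The markers such as [TCP] are stand-ins invented for this sketch; real headers are binary structures with addresses, ports, checksums, and so on.

```python
LAYERS = ["TCP", "IP", "ETH"]  # order in which the sender wraps the HTTP message

def encapsulate(http_message: bytes) -> bytes:
    """Sender side: each lower layer puts its own header in front of what it received."""
    data = http_message
    for layer in LAYERS:
        data = f"[{layer}]".encode("ascii") + data
    return data

def decapsulate(frame: bytes) -> bytes:
    """Receiver side: strip the headers in the reverse order, outermost first."""
    data = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode("ascii")
        if not data.startswith(header):
            raise ValueError(f"expected {layer} header")
        data = data[len(header):]
    return data

http_message = b"GET /index.htm HTTP/1.1\r\n\r\n"
frame = encapsulate(http_message)
print(frame)
print(decapsulate(frame) == http_message)
```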

7. HTTPS implementation principle

HTTPS (Hypertext Transfer Protocol over Secure Socket Layer) is a security-oriented HTTP channel; put simply, it is the secure version of HTTP. An SSL layer is added beneath HTTP, and SSL (today, its successor TLS) is the security foundation of HTTPS. The default port is 443.

SSL (Secure Sockets Layer) is a secure transport protocol designed by Netscape, mainly for the Web, where it has been widely used. It uses certificate authentication and encryption to keep the communication between the client and the website server confidential and secure.

1. Encryption and decryption algorithm:

There are two basic types of encryption and decryption algorithms:

1) Symmetric encryption: there is only one key, the same key is used for encryption and decryption, and both operations are fast. Typical symmetric algorithms include DES, 3DES, AES, and RC5.

The main problem with symmetric encryption is sharing the secret key. Unless your computer (the client) already knows the secret key held by the other computer (the server), the two cannot encrypt and decrypt each other's traffic. Asymmetric keys solve this problem.

2) Asymmetric encryption: two keys are used, a public key and a private key. The private key is kept secret by one party (usually the server), and the public key can be obtained by anyone.

The keys come in pairs, and the private key cannot feasibly be derived from the public key. Different keys are used for encryption and decryption (data encrypted with the public key requires the private key to decrypt, and data encrypted with the private key requires the public key to decrypt). Asymmetric encryption is relatively slow compared with symmetric encryption. Typical asymmetric algorithms include RSA and DSA.

2. The communication process of HTTPS

Let's take a look at the communication process of HTTPS:

The process is roughly as follows:

1. Connect over TCP and obtain the certificate:

The SSL client establishes a TCP connection to the server (port 443), and the certificate is then requested during the SSL handshake that follows.

That is, the client sends the server a message containing the list of algorithms it supports and other required information. The SSL server responds with a message that fixes the algorithms to be used for this session, and then sends its identity to the client in the form of a certificate. The certificate contains server information: the domain name or service address, the server's public key, and the authority that issued the certificate.

2. After receiving the certificate returned by the server, the client:

1) Verifies the certificate: checks that the issuing authority is trusted, uses that authority's public key to confirm that the signature is valid, and makes sure that the domain name listed in the certificate is the domain name it is connecting to. In a browser, a small lock is displayed in the address bar if the certificate is trusted; otherwise the browser warns that the certificate is not trusted.

2) If the certificate is valid, generates a random symmetric key, encrypts it with the server's public key, and sends it to the server.

3. After receiving the data sent by the client, the server does the following:

It uses its own private key to decrypt the message and extract the symmetric key, uses that key to decrypt the handshake message sent by the client, and verifies that the hash matches the one sent by the browser.

It then encrypts a handshake message with the symmetric key and sends it to the client. The client decrypts it and computes the hash of the handshake message. If it matches the hash sent by the server, the handshake is complete.

4. From then on, all communication data is encrypted with the random symmetric key generated earlier by the browser, using the negotiated symmetric encryption algorithm.
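In application code the handshake is normally delegated to a TLS library. As a hedged illustration, Python's ssl module creates a client context whose defaults enforce the certificate checks described above: the chain must lead to a trusted authority, and the certificate must match the hostname. The fetch_peer_certificate helper is a hypothetical name; it is defined here but not called, since calling it needs network access.

```python
import socket
import ssl

# Default client-side settings: verify the certificate chain and the hostname.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)

def fetch_peer_certificate(host: str, port: int = 443) -> dict:
    """Connect over TCP, run the TLS handshake, and return the server certificate fields."""
    with socket.create_connection((host, port), timeout=10) as sock:
        # wrap_socket performs the handshake; it raises ssl.SSLError if verification fails
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```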

Advantages of https communication:

1) The key generated by the client is known only to the client and the server.

2) The encrypted data can only be read in plaintext by the client and the server.

3) Communication between client and server is secure.

3. The encryption and decryption principle of the asymmetric algorithm RSA

1. Each user has a pair of keys: a private key and a public key.

The private key is used for decryption and signing, and is kept for the owner's own use.

The public key is published by the owner. Others use it to encrypt data and to verify signatures.

2. When a user sends a file, the user signs it with the private key, and others verify the signature with the user's public key, which guarantees that the information came from that user. This is a digital signature.

3. When a user receives a file, the sender encrypts it with the user's public key and the user decrypts it with the private key, which ensures that only that user can read it. This is secure transmission.
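The two uses of an RSA key pair can be demonstrated with textbook RSA on tiny numbers. This is purely a teaching sketch: the primes 61 and 53, the exponent 17, and the sample values are assumptions for the example, and real RSA uses keys of 2048 bits or more together with padding schemes.

```python
# Toy key generation with small primes (insecure, for illustration only).
p, q = 61, 53
n = p * q                 # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                    # public exponent
d = pow(e, -1, phi)       # private exponent: modular inverse of e (Python 3.8+)

# Secure transmission: anyone encrypts with the public key (e, n),
# only the owner decrypts with the private key (d, n).
message = 65
cipher = pow(message, e, n)
decrypted = pow(cipher, d, n)

# Digital signature: the owner signs with the private key,
# anyone verifies with the public key.
digest = 123              # stand-in for the hash of a document
signature = pow(digest, d, n)
signature_ok = pow(signature, e, n) == digest

print(decrypted == message, signature_ok)
```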

4. Digital certificates and the CA

A digital certificate is a digital file produced by a certificate authority (CA). After verifying the true identity of the applicant, the CA signs some basic information about the applicant together with the applicant's public key, using the private key that corresponds to its root certificate (the equivalent of stamping the document with the issuing authority's official seal). After the CA has issued the certificate, it is published in the CA's certificate store (directory server), where anyone can query and download it, so a digital certificate is as public as a public key. In effect, a digital certificate is a public key vouched for by a CA.

5. CA authentication process

SSL two-way authentication steps:

The server applies for a certificate from a CA, a trusted third party that can be either a recognized authority or the enterprise itself. Internal enterprise systems generally use the enterprise's own certificate infrastructure. The CA provides the applicant with the root certificate, a server certificate, and the matching private key.

The client likewise applies to the CA, which provides the root certificate, a client certificate, and the matching private key.

The client initiates a request to the server, and the server sends its certificate to the client. The client verifies the certificate's signature with the public key in the CA root certificate, and checks that the details in the certificate, such as the domain name and public key, are consistent with what the server has just sent. If so, the client accepts the server's identity as legitimate.

The client sends its certificate to the server. The server verifies the certificate's signature in the same way with the CA root certificate, obtains the client's public key from it, and confirms whether the client is legitimate.

The server and the client then negotiate an encryption scheme, and the client generates a random secret key. The client encrypts the random key under the negotiated scheme and sends it to the server. Once the server has received the key, all communication between the two parties is encrypted with it.
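With Python's ssl module, two-way authentication comes down to one extra setting on the server side: the server context must require a certificate from the client. The sketch below shows only the configuration. The file names in the comments (ca.crt, server.crt, and so on) are hypothetical placeholders, which is why those lines are commented out.

```python
import ssl

# Server side: present a certificate and also demand one from the client.
server_ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
server_ctx.verify_mode = ssl.CERT_REQUIRED  # this is what makes authentication two-way
# server_ctx.load_cert_chain("server.crt", "server.key")  # hypothetical server certificate and key
# server_ctx.load_verify_locations("ca.crt")              # hypothetical CA root used to check clients

# Client side: verify the server (the default) and present a client certificate.
client_ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
# client_ctx.load_cert_chain("client.crt", "client.key")  # hypothetical client certificate and key
# client_ctx.load_verify_locations("ca.crt")              # hypothetical CA root used to check the server

print(server_ctx.verify_mode == ssl.CERT_REQUIRED)
print(client_ctx.verify_mode == ssl.CERT_REQUIRED)
```

Without the verify_mode line, a server context created this way does not ask the client for a certificate, which corresponds to ordinary one-way HTTPS.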

8. Various length limits of HTTP

1. URL length limit

The HTTP/1.1 protocol places no limit on the length of a URL. As the RFC states, HTTP imposes no a priori restriction on the length of a URI. A server must be able to handle the URI of any resource it serves, and should be able to handle URIs of unbounded length. If a server cannot handle a URI because it is too long, it should return a 414 status code.

Although the protocol sets no limit, both Web servers and browsers impose their own length limits on URIs.

Server limits: the servers the author works with most are Nginx and Tomcat. Both limit URL length indirectly by limiting the size of the HTTP request headers. The Nginx directive is large_client_header_buffers, and the Tomcat connector attribute is maxHttpHeaderSize; both can be configured.

Browser limits: each browser also has a limit on the length of url. Here are the url length limits for several common browsers: (unit: characters)

IE: 2083

Firefox:65536

Chrome:8182

Safari:80000

Opera:190000

For GET requests, there is no limit on the number of parameters as long as the URL stays within the length limit.
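The server-side rule can be sketched as a simple length check that answers 414 when the request URI is too long. The status_for_uri function and the 8192-character limit are assumptions for this example; actual limits depend on the server and its configuration.

```python
from urllib.parse import urlencode

def status_for_uri(uri: str, max_uri_length: int = 8192) -> int:
    """Return 414 (URI Too Long) if the URI exceeds the configured limit, else 200."""
    return 414 if len(uri) > max_uri_length else 200

short_uri = "/search?" + urlencode({"q": "http"})
long_uri = "/search?" + urlencode({"q": "x" * 10000})  # a single oversized parameter

print(status_for_uri(short_uri), status_for_uri(long_uri))
```

Only the total length is checked, not the number of parameters, which matches the observation above that GET requests may carry any number of parameters within the limit.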

2. Length limit of Post data

The length limit of POST data is similar to that of the URL: the HTTP protocol itself sets none. In practice the limit is set on the server side, typically through a maximum request body size setting (for example, client_max_body_size in Nginx or maxPostSize in Tomcat).

3. Length limit of Cookie

The length limit of cookies can be summarized in several aspects.

(1) The maximum number of cookies the browser allows per domain. The author has not tested these figures; they are the ones commonly cited online.

IE: originally 20, then upgraded to 50

Firefox: 50

Opera:30

Chrome:180

Safari: unlimited

Behavior when the number of cookies exceeds the limit: IE and Opera use an LRU algorithm to remove old, infrequently used cookies, while Firefox evicts some cookies at random. Whatever the strategy, try not to let the number of cookies exceed what the browser allows.

(2) the maximum length of each Cookie allowed by the browser

Firefox and Safari: 4079 bytes

Opera: 4096 bytes

IE: 4095 bytes

(3) The server's limit on HTTP request header length. Cookies are attached to the header of every HTTP request sent to the server, so they also count against the server's request header limit.
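Before setting a cookie, an application can measure its size and compare it with a budget. The sketch below uses Python's http.cookies module; the cookie_size and fits_limit helpers and the 4000-byte budget are assumptions for the example, picked to stay under the per-cookie limits listed above.

```python
from http.cookies import SimpleCookie

MAX_COOKIE_BYTES = 4000  # hypothetical budget, below the browser limits listed above

def cookie_size(name: str, value: str) -> int:
    """Size in bytes of the 'name=value' pair as it would be sent in a header."""
    cookie = SimpleCookie()
    cookie[name] = value
    return len(cookie[name].OutputString().encode("utf-8"))

def fits_limit(name: str, value: str, limit: int = MAX_COOKIE_BYTES) -> bool:
    return cookie_size(name, value) <= limit

print(cookie_size("session", "a" * 100))
print(fits_limit("session", "a" * 100), fits_limit("session", "a" * 5000))
```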

4. HTML5 LocalStorage

HTML5 provides a local storage mechanism that lets Web applications store data on the client side. Although it is not part of the HTTP protocol, with the spread of HTML5 we may use LocalStorage more and more, perhaps as routinely as we use cookies today.

LocalStorage limits are similar to cookie limits in that browsers apply them per domain. The difference is that cookies are limited by count, while LocalStorage is limited by size:

Firefox, Chrome, and Opera all allow up to 5 MB per domain.

IE is more generous this time and allows up to 10 MB.

