How to parse HTTP

2025-07-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

Today we will look at how to parse HTTP. Many people are not very familiar with it, so the key points are summarized below. I hope you take something away from this article.

HTTP is the abbreviation of HyperText Transfer Protocol. It is the protocol used to transfer hypertext from a World Wide Web (WWW) server to a local browser.

HTTP is a communication protocol that transfers data (HTML files, image files, query results, etc.) on top of TCP/IP.

HTTP is an application-layer protocol, described in its early specifications as object-oriented. Because it is simple and fast, it is well suited to distributed hypermedia information systems. It was proposed in 1990 and has been improved and extended continuously since then. HTTP/1.0 was followed by HTTP/1.1, which has long been standardized and is the version used in the examples in this article; HTTP/2 and HTTP/3 have since been standardized as well.

HTTP works on a client-server architecture. The browser acts as the HTTP client and sends each request, addressed by a URL, to the HTTP server (the Web server). After receiving the request, the Web server sends a response back to the client.

An HTTP response consists of four parts. The first part is the status line, which consists of the HTTP protocol version, the status code and the status message.

For example, in a response whose first line is HTTP/1.1 200 OK, that line is the status line: HTTP/1.1 indicates that the HTTP version is 1.1, the status code is 200, and the status message is OK.

The second part: the message header, which is used to describe some additional information to be used by the client.

In a typical response, the second and third lines are message headers:

Date: the date and time when the response was generated. Content-Type: the MIME type of the body, for example HTML (text/html), together with the character encoding, for example UTF-8.

The third part: a blank line. The blank line after the message headers is mandatory.

The fourth part: the response body, the content returned by the server to the client.

The html section after the blank line is the response body.
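The four parts described above can be separated with a few lines of Python. This is a minimal sketch, not a full parser: the function name `parse_response` and the sample response (including its Date value) are made up for illustration, and the code assumes a well-formed HTTP/1.x response with a status message and no chunked encoding.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.x response into its four parts."""
    # Part 3: the blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Part 1: the status line holds version, status code and status message.
    version, status, reason = lines[0].split(" ", 2)

    # Part 2: message headers, one "Name: value" pair per line.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Part 4: everything after the blank line is the response body.
    return version, int(status), reason, headers, body


# Hypothetical sample response, used only to exercise the function.
SAMPLE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Sat, 31 Dec 2005 23:59:59 GMT\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"\r\n"
    b"<html><body>Hello</body></html>"
)

version, status, reason, headers, body = parse_response(SAMPLE)
print(version, status, reason)
print(headers["content-type"])
print(body.decode("utf-8"))
```

Header names are lower-cased because HTTP header names are case-insensitive; a real client would also have to handle repeated headers and bodies delimited by Content-Length or chunked encoding.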

Status code of HTTP

The status code consists of three digits, the first of which defines the category of response, which is divided into five categories:

1xx: informational. The request has been received and processing continues.

2xx: success. The request has been successfully received, understood and accepted.

3xx: redirection. Further action must be taken to complete the request.

4xx: client error. The request has a syntax error or cannot be fulfilled.

5xx: server error. The server failed to fulfill a legitimate request.
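As a small illustration of the five categories, the first digit of a status code can be mapped to its class, and Python's standard `http.HTTPStatus` enum can supply the usual status message. The helper names `status_class` and `describe` are hypothetical, chosen for this sketch only.

```python
from http import HTTPStatus

# The first digit of the status code selects one of five categories.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the category name for a three-digit status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return CLASSES[code // 100]


def describe(code: int) -> str:
    """Combine the code, its standard message (if known) and its category."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # a code the standard library does not list
    return f"{code} {phrase} ({status_class(code)})"


for code in (200, 404, 500):
    print(describe(code))
```

A client that meets a code it does not recognize can still act on the category, which is why the first digit is worth checking separately.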

Common status codes:

200 OK: the client request succeeded.

HTTP request methods include the following:

DELETE: requests that the server delete the specified resource.

CONNECT: reserved in HTTP/1.1 for proxy servers that can switch the connection to a tunnel.

OPTIONS: lets the client query the capabilities of the server.

TRACE: echoes the request received by the server; used mainly for testing or diagnosis.

How HTTP works

HTTP defines how a Web client requests Web pages from a Web server and how the server delivers them to the client. It follows a request/response model. The client sends the server a request message containing the request method, URL, protocol version, request headers and request data. The server responds with a status line (the protocol version plus a success or error code), followed by server information, response headers and response data.
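To make the request side concrete, here is a minimal sketch that assembles a request message from the pieces just listed: method, URL path, protocol version, headers and optional data. The function name `build_request` and the host www.example.com are assumptions made for the example.

```python
def build_request(method, target, headers, body=b"", version="HTTP/1.1"):
    """Assemble a request message: request line, headers, blank line, data."""
    # Request line: method, request target (path and query) and version.
    lines = [f"{method} {target} {version}"]
    # One "Name: value" line per request header.
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # CRLF ends every line; an extra CRLF forms the blank line before the data.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


message = build_request("GET", "/index.html", {"Host": "www.example.com"})
print(message.decode("ascii"))
```

A request without data, such as this GET, simply ends after the blank line.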

The following are the steps for the HTTP request / response:

1. The client connects to the Web server

An HTTP client, usually a browser, establishes a TCP socket connection to the HTTP port of the Web server (80 by default), for example for http://www.oakcms.cn.

2. Send HTTP request

Through the TCP socket, the client sends a text request message to the Web server. A request message consists of four parts: the request line, the request header, the blank line and the request data.

3. The server accepts the request and returns an HTTP response

The Web server parses the request and locates the requested resource. The server writes a copy of the resource to the TCP socket, and the client reads it. A response consists of four parts: the status line, the response headers, the blank line and the response data.

4. Release the TCP connection

If the connection mode is close, the server actively closes the TCP connection and the client closes it passively, which releases the connection. If the connection mode is keep-alive, the connection stays open for a period of time, during which it can carry further requests.

5. Client browser parses HTML content

The client browser first parses the status line to see the status code, which indicates whether the request succeeded. It then parses each response header; the headers tell the browser that what follows is an HTML document of a given number of bytes and which character set it uses. Finally, the browser reads the response data (the HTML), formats it according to HTML syntax, and displays it in the browser window.
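The five steps can be tried end to end without leaving the local machine. The sketch below is an illustration under stated assumptions: it starts a throwaway server from Python's standard `http.server` module on 127.0.0.1 with a port chosen by the operating system, and then plays the client with a plain TCP socket. `DemoHandler`, `fetch`, `demo` and the page content are hypothetical names and data.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>hello</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Step 3 (server side): status line, headers, blank line, data.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(host, port, path="/"):
    # Step 1: open a TCP connection to the server.
    with socket.create_connection((host, port), timeout=5) as sock:
        # Step 2: send the request line, headers and the blank line.
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        # Steps 3 and 4: read until the server closes the connection.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Step 5: split the response and look at the status line first.
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("iso-8859-1")
    return status_line, body


def demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Because the request carries Connection: close, the server closes the socket after one response, so the client can simply read until end of stream. With keep-alive the client would instead have to rely on Content-Length to know where the response ends.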

For example, when you type a URL in the browser address bar and press Enter, the following happens:

1. The browser requests the DNS server to resolve the IP address corresponding to the domain name in the URL.

2. After resolving the IP address, establish a TCP connection with the server according to the IP address and the default port 80.

3. The browser issues an HTTP request for the file (the path that follows the domain name in the URL). The request message can be sent to the server as the data of the third segment of the TCP three-way handshake.

4. The server responds to the browser request and sends the corresponding html text to the browser.

5. Release the TCP connection

6. The browser parses the HTML text and displays the content.
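Before any of these steps can run, the browser has to take the URL apart: the host name is what goes to DNS, the port defaults to 80 for http, and the rest becomes the path in the request line. A minimal sketch using the standard `urllib.parse` module follows; `connection_target` is a hypothetical helper name and port 8080 in the second URL is an invented example.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_target(url):
    """Return (host, port, request path) for an http or https URL."""
    parts = urlsplit(url)
    host = parts.hostname  # the name handed to the DNS resolver
    port = parts.port or DEFAULT_PORTS[parts.scheme]  # 80 unless stated
    path = parts.path or "/"  # an empty path means the root document
    if parts.query:
        path += "?" + parts.query
    return host, port, path


print(connection_target("http://www.oakcms.cn"))
print(connection_target("http://www.wrox.com:8080/books/?sex=man"))
```

The resolution itself could then be done with `socket.getaddrinfo(host, port)`, which is left out here so that the example runs without network access.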

Differences between GET and POST requests

GET request

GET /books/?sex=man&name=Professional HTTP/1.1
Host: www.wrox.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050225 Firefox/1.0.1
Connection: Keep-Alive

Notice that the last line is blank.

POST request

POST / HTTP/1.1
Host: www.wrox.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050225 Firefox/1.0.1
Content-Type: application/x-www-form-urlencoded
Content-Length: 40
Connection: Keep-Alive

name=Professional%20Ajax&publisher=Wiley
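A server receiving the POST request above would split it at the blank line, use Content-Length to decide how many bytes of data belong to the request, and decode the form fields. The following is a rough sketch of that idea, not how any particular server does it; `parse_request` is a hypothetical name and the User-Agent header is omitted from the sample only to keep it short.

```python
from urllib.parse import parse_qs

RAW_POST = (
    b"POST / HTTP/1.1\r\n"
    b"Host: www.wrox.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 40\r\n"
    b"Connection: Keep-Alive\r\n"
    b"\r\n"
    b"name=Professional%20Ajax&publisher=Wiley"
)


def parse_request(raw: bytes):
    """Split a raw request into method, target, version, headers and data."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Content-Length says how many bytes after the blank line are request data.
    length = int(headers.get("content-length", 0))
    return method, target, version, headers, body[:length]


method, target, version, headers, body = parse_request(RAW_POST)
form = parse_qs(body.decode("ascii"))  # decodes %20 back into a space
print(method, target, version)
print(form)
```

The 40 in Content-Length matches the length of the data line exactly, which is what allows the connection to stay open (Keep-Alive) for the next request.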

1. With GET, the submitted data is appended to the URL, so it travels in the request line of the HTTP message rather than in the body. A ? separates the URL from the data, and multiple parameters are joined with &, for example: login.action?name=hyddd&password=idontknow&verify=%E4%BD%A0%E5%A5%BD. English letters and digits are sent as is; a space is converted to +; Chinese and other characters are percent-encoded (this is URL encoding, not BASE64 and not encryption), giving strings such as %E4%BD%A0%E5%A5%BD, where the XX in %XX is the hexadecimal value of one byte of the encoded character.

With POST, the submitted data is placed in the body of the HTTP message. In the POST example above, name=Professional%20Ajax&publisher=Wiley is the data actually transmitted.

Therefore, the data submitted by GET will be displayed in the address bar, while the address bar will not change when submitted by POST.
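The encoding rules for GET data can be reproduced with Python's standard `urllib.parse` module. In this sketch the parameter names come from the login.action example; the value of `verify` is assumed to be the two Chinese characters that the percent-encoded string in that example decodes to under UTF-8.

```python
from urllib.parse import quote, quote_plus, unquote, urlencode

params = {"name": "hyddd", "password": "idontknow", "verify": "你好"}

# urlencode joins key=value pairs with & and percent-encodes each value.
query = urlencode(params)
url = "login.action?" + query
print(url)

# A space becomes + in form-style encoding and %20 in plain percent-encoding.
print(quote_plus("a b"), quote("a b"))

# Each %XX is one byte of the UTF-8 encoding, written in hexadecimal.
print("你好".encode("utf-8").hex().upper())
print(unquote("%E4%BD%A0%E5%A5%BD"))
```

Percent-encoding is fully reversible by anyone who sees the URL, so it hides nothing; it only makes the data safe to place in a URL.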

2. Size of the transmitted data: to begin with, the HTTP specification limits neither the size of the transmitted data nor the length of the URL.

The main limitations in actual development are:

GET: specific browsers and servers limit URL length. For example, IE limits URLs to 2083 characters (2K+35). For other browsers, such as Netscape and Firefox, there is in theory no length limit; the practical limit depends on the operating system.

Therefore, when the GET is submitted, the transfer of data is limited by the length of the URL.

POST: since the values are not passed in the URL, the amount of data is in theory unlimited. In practice, each Web server limits the size of POST data, and Apache and IIS6 each have their own configuration settings for this.
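One practical consequence of these limits is that a client may fall back to POST when the data would make the URL too long. The sketch below illustrates that decision only; `choose_method` is a hypothetical helper, www.example.com is a placeholder host, and the 2083-character threshold is simply the IE figure quoted above, not a universal constant.

```python
from urllib.parse import urlencode

MAX_URL_LENGTH = 2083  # the IE limit quoted above; other software differs


def choose_method(base_url, params, limit=MAX_URL_LENGTH):
    """Return (method, url, body), using GET only if the URL stays short."""
    encoded = urlencode(params)
    url = base_url + "?" + encoded
    if len(url) <= limit:
        return "GET", url, None
    # Too long for the URL: carry the same data in the request body instead.
    return "POST", base_url, encoded.encode("ascii")


print(choose_method("http://www.example.com/search", {"q": "http"}))
method, url, body = choose_method("http://www.example.com/search", {"q": "x" * 3000})
print(method, url, len(body))
```

Switching methods is only safe when the server accepts both for the same operation, which is an assumption of this example rather than something HTTP guarantees.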

3. Security

POST exposes less than GET. For example, if you submit login data with GET, the user name and password appear in clear text in the URL. This is a problem because (1) the login page may be cached by the browser, and (2) anyone who views the browser history can obtain your account and password. In addition, using GET for requests that change data makes cross-site request forgery (CSRF) attacks easier. Note that POST does not encrypt anything by itself; the body is still sent in clear text unless HTTPS is used.

4. GET, POST and SOAP all run over HTTP.

(1) get: the request parameter is attached to the URL as a sequence of key/value pairs (query string)

The length of the query string is limited by Web browsers and Web servers (for example, IE supports up to 2048 characters), so GET is not suitable for transmitting large datasets, and it is not secure.

(2) post: the request parameters are carried in a separate part of the HTTP message, the entity body, rather than in the headers. POST is used to transfer form information, in which case Content-Type is set to application/x-www-form-urlencoded. POST was designed to carry the user fields of Web forms, and its parameters are also transmitted as key/value pairs.

However, it does not support complex data types, because POST does not define the semantics or rules for transferring data structures.

(3) soap: a specialized use of HTTP POST that follows a particular XML message format.

Content-Type is set to text/xml, and any data can be expressed as XML.
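The difference between a form POST and a SOAP POST is easiest to see by building both. The sketch below only constructs `urllib.request.Request` objects and never sends them. The URLs, the operation name `GetBook` and the helper names are invented for illustration; the envelope namespace is the SOAP 1.1 one, and a real SOAP 1.1 service would normally also expect a SOAPAction header, which is left out here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import Request

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope


def form_post(url, fields):
    """POST with key/value pairs in the entity body."""
    body = urlencode(fields).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


def soap_post(url, operation, params):
    """POST whose entity body is an XML envelope instead of key/value pairs."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, operation)  # hypothetical operation element
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    payload = ET.tostring(envelope, encoding="utf-8")
    headers = {"Content-Type": "text/xml; charset=utf-8"}
    return Request(url, data=payload, headers=headers, method="POST")


form = form_post("http://www.example.com/", {"name": "Professional Ajax", "publisher": "Wiley"})
soap = soap_post("http://www.example.com/soap", "GetBook", {"name": "Professional Ajax"})
print(form.get_method(), form.data)
print(soap.get_method(), soap.data.decode("utf-8"))
```

Passing either object to `urllib.request.urlopen` would send it. Note that `urlencode` writes the space as +, whereas the earlier POST example used %20; both forms are accepted in form-encoded data.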

HTTP defines many methods for interacting with the server, the most basic of which are GET, POST, PUT and DELETE. A URL describes a resource on the network, and GET, POST, PUT and DELETE correspond to the four operations of querying, modifying, adding and deleting that resource. The most common are GET and POST: GET is generally used to obtain or query resource information, while POST is generally used to update it.

Let's look at the difference between GET and POST.

The data submitted by GET is placed after the URL: a ? separates the URL from the data, and parameters are joined with &, as in EditPosts.aspx?name=test1&id=123456. The POST method puts the submitted data in the body of the HTTP message.

The size of the data submitted by GET is limited (because browsers limit the length of the URL), while HTTP itself places no limit on data submitted by POST, although servers usually do.

In ASP.NET, values submitted by GET are read with Request.QueryString, while values submitted by POST are read with Request.Form.

Submitting data by GET brings security problems. Take a login page: when the data is submitted through GET, the user name and password appear in the URL. If the page can be cached or other people can use the machine, the account and password can be recovered from the browser history.
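The split between Request.QueryString and Request.Form mirrors where the data travels. As a rough Python analogue (an illustration, not the ASP.NET implementation), the hypothetical function `request_values` below reads the query string for GET and the message body for POST:

```python
from urllib.parse import parse_qs, urlsplit


def request_values(method, target, body=b""):
    """Return submitted values, taken from the URL for GET or the body for POST."""
    if method == "GET":
        # GET: the data follows the ? in the request target.
        return parse_qs(urlsplit(target).query)
    if method == "POST":
        # POST: the data is the form-encoded message body.
        return parse_qs(body.decode("utf-8"))
    raise ValueError(f"unsupported method: {method}")


print(request_values("GET", "/EditPosts.aspx?name=test1&id=123456"))
print(request_values("POST", "/EditPosts.aspx", b"name=test1&id=123456"))
```

Both calls yield the same dictionary, which shows that the two methods differ in where the data is carried, not in how it is encoded.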
